#archiveteam-bs 2019-12-15,Sun


Time Nickname Message
00:07 🔗 bluefoo_ anyone know if archive.org actually manages to archive twitter content or if it's just the javascript loading it
00:08 🔗 JAA No idea what the WBM SPN does (though I'd assume it works), but the ArchiveBot grabs seem to play back fine.
00:08 🔗 JAA Once Twitter rolls out the redesign, that will almost certainly change.
00:11 🔗 bluefoo_ whats an SPN
00:11 🔗 Kaz save page now
00:15 🔗 kiiwii I found this other piece of software called gopherbot, I think I may wanna use this for archiving gopherspace but I don't know how to execute haskell code. gopher://gopherspace.de:70/1/menu/Downloads/Gopher_Querying/gopherbot/
00:19 🔗 bluefoo_ ahh
00:19 🔗 bluefoo_ kiiwii: i dont even know how to open that link
00:19 🔗 bluefoo_ kiiwii: is it just raw haskell code?
00:19 🔗 bluefoo_ kiiwii: is there a setup.hs file?
00:20 🔗 bluefoo_ if you need to compile it, you probably need to install cabal-install, which is the haskell build tool
00:20 🔗 kiiwii There's a setup.lhs file
00:21 🔗 kiiwii Config.hs COPYING.txt COPYRIGHT.txt DB.hs DBProcs.hs DirParser.hs gopherbot.cabal.txt gopherbot.hs Makefile.txt NetClient.hs RobotsTxt.hs Setup.lhs Types.hs Utils.hs
00:21 🔗 jodizzle has quit IRC (Quit: ZNC 1.7.1 - https://znc.in)
00:21 🔗 kiiwii Those are the files
00:21 🔗 jodizzle has joined #archiveteam-bs
00:42 🔗 godane SketchCow: film score monthly is getting uploaded: https://archive.org/details/Film_Score_Monthly_Volume_01_Issue_02_1990_06_Vineyard_Haven_US
01:02 🔗 SoraUta has joined #archiveteam-bs
01:09 🔗 superkuh_ has joined #archiveteam-bs
02:46 🔗 kiiwii Turns out the gopherbot code is over a decade old meaning it won't compile on a current machine :/
02:46 🔗 kiiwii So I'm just gonna have to continue using the python code
02:46 🔗 Wingy Can you use wget?
02:46 🔗 Wingy Oh I think only curl supports gopher
02:47 🔗 Wingy So are you trying to archive *every* gopher server?
02:47 🔗 kiiwii Every gopherhole, yeah
02:47 🔗 markedL why was gopherbot better?
02:47 🔗 kiiwii I wasn't sure, I wanted to try it out
02:48 🔗 kiiwii But with the way I'm doing it, I have to type in every site manually.
02:48 🔗 kiiwii And there's ~300 gopherholes out there
02:49 🔗 Wingy Can you recursively download gopher://gopher.quux.org/1/Software/Gopher/servers?
02:49 🔗 kiiwii I can try.
02:49 🔗 markedL we can put the python code in to the warriors
02:50 🔗 kiiwii How can I send you the file? Or should I send the link to the github repo?
02:50 🔗 markedL though, really, could just script it also if you don't need more than a few IPs
02:54 🔗 kiiwii Yeah, I can't download recursively from gopher://gopher.quux.org/1/Software/Gopher/servers
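For readers following the gopher thread: a recursive crawl mostly comes down to fetching a menu, parsing its tab-separated lines, and following the type `1` (submenu) entries. The sketch below is not kiiwii's Python script, which is not shown in the log; the names `fetch`, `parse_menu`, and `submenus` are made up for illustration, and it assumes plain RFC 1436 style menus of the form `<type><display>TAB<selector>TAB<host>TAB<port>`.

```python
import socket
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # "1" = submenu, "0" = text file, "i" = info line, ...
    display: str
    selector: str
    host: str
    port: int


def fetch(host: str, selector: str = "", port: int = 70, timeout: float = 30.0) -> bytes:
    """Send one selector to a gopher server and read until it closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_menu(raw: bytes) -> List[MenuItem]:
    """Parse a gopher menu; malformed lines are skipped rather than raising."""
    items = []
    for line in raw.decode("utf-8", errors="replace").split("\n"):
        line = line.rstrip("\r")
        if line == ".":  # a lone dot terminates the menu
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:
            continue
        try:
            port = int(fields[3])
        except ValueError:
            continue
        items.append(MenuItem(line[0], fields[0], fields[1], fields[2], port))
    return items


def submenus(items: List[MenuItem], host: str) -> List[MenuItem]:
    """Menu entries worth recursing into while staying on the same host."""
    return [item for item in items if item.item_type == "1" and item.host == host]
```

A crawler would call `fetch`, save the raw bytes, and push the result of `submenus(parse_menu(raw), host)` onto a queue, keeping a set of already visited selectors so that menus linking back to each other do not loop forever.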
03:24 🔗 cerca has quit IRC (Remote host closed the connection)
04:08 🔗 Flashfire bellsouth.net should probably be scraped at some stage, a lot of websites are hosted on it
04:17 🔗 kiiwii Starting to archive gopher.quux.org, this may be a big gopherhole
04:17 🔗 odemgi_ has joined #archiveteam-bs
04:22 🔗 odemgi has quit IRC (Read error: Operation timed out)
04:30 🔗 kiska has quit IRC (Remote host closed the connection)
04:30 🔗 Flashfire has quit IRC (Remote host closed the connection)
04:31 🔗 Flashfire has joined #archiveteam-bs
04:31 🔗 kiska has joined #archiveteam-bs
04:31 🔗 Flashfire Kiska you reset it did you?
04:31 🔗 svchfoo1 sets mode: +o kiska
04:31 🔗 svchfoo3 sets mode: +o kiska
04:32 🔗 kiska Huh?
04:43 🔗 superkuh_ has quit IRC (Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilaye)
04:45 🔗 stapler11 has joined #archiveteam-bs
04:47 🔗 odemgi has joined #archiveteam-bs
04:51 🔗 odemgi_ has quit IRC (Read error: Connection reset by peer)
04:55 🔗 qw3rty has joined #archiveteam-bs
05:04 🔗 qw3rty2 has quit IRC (Ping timeout: 745 seconds)
05:05 🔗 DogsRNice has quit IRC (Read error: Connection reset by peer)
05:13 🔗 SketchCow I'm happy to report FOS is running "only" about 24 hours behind uploading Archivebot grabs.
05:58 🔗 godane SketchCow: so more interesting cover art of japanese manuals are coming
05:59 🔗 godane mostly cause of Hitachi Microwave ovens in 96xxx area
06:09 🔗 HP_Archiv has joined #archiveteam-bs
06:11 🔗 Jopik has quit IRC (Read error: Connection reset by peer)
06:11 🔗 Jopik has joined #archiveteam-bs
06:30 🔗 SoraUta has quit IRC (Remote host closed the connection)
06:31 🔗 SoraUta has joined #archiveteam-bs
07:54 🔗 killsushi has quit IRC (Quit: Leaving)
08:36 🔗 stapler11 has quit IRC (Read error: Connection reset by peer)
08:46 🔗 LowLevelM has quit IRC (Read error: Operation timed out)
08:47 🔗 LowLevelM has joined #archiveteam-bs
09:05 🔗 LowLevelM has quit IRC (Read error: Operation timed out)
09:34 🔗 LowLevelM has joined #archiveteam-bs
09:47 🔗 trc has joined #archiveteam-bs
10:15 🔗 kiska18 has quit IRC (Remote host closed the connection)
10:15 🔗 Ryz has quit IRC (Remote host closed the connection)
10:15 🔗 kiska18 has joined #archiveteam-bs
10:16 🔗 Ryz has joined #archiveteam-bs
10:16 🔗 svchfoo3 sets mode: +o kiska18
10:16 🔗 svchfoo1 sets mode: +o kiska18
10:48 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
10:51 🔗 Atom__ has joined #archiveteam-bs
10:57 🔗 Atom-- has quit IRC (Read error: Operation timed out)
11:01 🔗 tech234a has quit IRC (Quit: Connection closed for inactivity)
11:41 🔗 cerca has joined #archiveteam-bs
11:57 🔗 LeighR has joined #archiveteam-bs
12:09 🔗 ephemer0l has joined #archiveteam-bs
12:20 🔗 Wingy has quit IRC (Remote host closed the connection)
12:42 🔗 ephemer0l has quit IRC (Ping timeout: 745 seconds)
12:44 🔗 Ravenloft has quit IRC (Read error: Operation timed out)
12:53 🔗 qwebirc60 has quit IRC (Quit: Page closed)
12:53 🔗 InkArchiv has joined #archiveteam-bs
13:21 🔗 SoraUta has quit IRC (Read error: Operation timed out)
13:35 🔗 SoraUta has joined #archiveteam-bs
13:41 🔗 chazchaz has quit IRC (Read error: Operation timed out)
13:41 🔗 chazchaz has joined #archiveteam-bs
13:46 🔗 SoraUta has quit IRC (Read error: Operation timed out)
13:47 🔗 Sauce has joined #archiveteam-bs
13:47 🔗 Sauce is now known as amdk6
14:10 🔗 amdk6 has quit IRC (C:\exit.exe)
14:14 🔗 superkuh_ has joined #archiveteam-bs
14:25 🔗 Wingy has joined #archiveteam-bs
14:27 🔗 Wingy has quit IRC (Client Quit)
14:27 🔗 Wingy has joined #archiveteam-bs
14:37 🔗 ephemer0l has joined #archiveteam-bs
15:15 🔗 SootBectr re; https://archiveteam.org/index.php?title=Warrior#Can_I_use_whatever_internet_access_for_the_warrior.3F I assume that Pi-Hole (or any other DNS blocklist setup) should not be used? If that's the case it could be made explicit there.
15:19 🔗 JAA Correct, such things should not be used with workers.
15:20 🔗 Wingy JAA: Do you know if anyone responded re: my wiki account?
15:21 🔗 JAA Wingy: jrwr can't do it, so you'll have to wait for SketchCow.
15:21 🔗 Wingy Okay thanks :)
15:25 🔗 SootBectr Thanks JAA
15:26 🔗 Wingy JAA: Should I change the warrior-dockerfile to always use 1.1.1.1 or 8.8.8.8?
15:26 🔗 Wingy (and submit as PR ofc)
15:30 🔗 LeighR Wingy: ooo, that's a good idea
15:30 🔗 SootBectr Some people make their router redirect all DNS traffic to their chosen server, so it may be worth mentioning on the wiki.
15:30 🔗 LeighR Wingy: at the very least, it should be an ENV, and set to 1.1.1.1 or 8.8.8.8 as the default
15:39 🔗 Kaz yeah, shoot it through
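LeighR's suggestion (an environment variable that defaults to a public resolver) could look roughly like the sketch below. This is only an illustration of the idea, not the actual warrior-dockerfile change: the variable name `WARRIOR_DNS` and the function `resolver_lines` are hypothetical, and the output is simply the `nameserver` lines one would write to a resolv.conf style file.

```python
import ipaddress
import os
from typing import List, Mapping

# Defaults taken from the two resolvers mentioned in the channel.
DEFAULT_DNS = ("1.1.1.1", "8.8.8.8")


def resolver_lines(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Build resolv.conf 'nameserver' lines from a comma-separated env variable.

    WARRIOR_DNS is a made-up name for this sketch; an unset or empty value
    falls back to DEFAULT_DNS.
    """
    raw = environ.get("WARRIOR_DNS", "")
    servers = [s.strip() for s in raw.split(",") if s.strip()] or list(DEFAULT_DNS)
    for server in servers:
        ipaddress.ip_address(server)  # raises ValueError on anything that is not an IP
    return ["nameserver " + server for server in servers]
```

As SootBectr points out, this would not help on networks where the router redirects all DNS traffic to its own resolver.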
15:42 🔗 InkArchiv has quit IRC (Quit: Page closed)
16:07 🔗 trc has quit IRC (Quit: Goodbye)
16:44 🔗 godane SketchCow: this may interest you : http://publ.lib.ru/ARCHIVES/
16:44 🔗 godane tons of russian stuff
16:45 🔗 JAA Looks like there was an AB job for all of http://www.publ.lib.ru/ two years ago.
17:09 🔗 i0npulse has quit IRC (Ping timeout: 248 seconds)
17:31 🔗 godane ok
17:54 🔗 i0npulse has joined #archiveteam-bs
18:20 🔗 tech234a has joined #archiveteam-bs
18:51 🔗 schbirid has joined #archiveteam-bs
19:01 🔗 Ravenloft has joined #archiveteam-bs
19:14 🔗 LeighR has quit IRC (Ping timeout: 260 seconds)
19:21 🔗 Stilettoo has joined #archiveteam-bs
19:22 🔗 Stiletto has quit IRC (Read error: Operation timed out)
19:39 🔗 kiiwii has quit IRC (Quit: Konversation terminated!)
19:39 🔗 kiiwii has joined #archiveteam-bs
19:55 🔗 Ravenloft has quit IRC (Read error: Operation timed out)
20:30 🔗 SoraUta has joined #archiveteam-bs
20:51 🔗 X-Scale` has joined #archiveteam-bs
20:59 🔗 X-Scale has quit IRC (Ping timeout: 610 seconds)
20:59 🔗 X-Scale` is now known as X-Scale
21:03 🔗 BlueMax has joined #archiveteam-bs
21:07 🔗 killsushi has joined #archiveteam-bs
21:14 🔗 schbirid has quit IRC (Quit: Leaving)
21:25 🔗 ephemer0l has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.)
21:46 🔗 ephemer0l has joined #archiveteam-bs
21:50 🔗 mtntmnky has quit IRC (Remote host closed the connection)
21:50 🔗 mtntmnky has joined #archiveteam-bs
22:01 🔗 ShellyRol has quit IRC (Read error: Connection reset by peer)
22:03 🔗 ShellyRol has joined #archiveteam-bs
22:05 🔗 Stilettoo has quit IRC (Read error: Operation timed out)
22:08 🔗 Stiletto has joined #archiveteam-bs
22:08 🔗 kode54 has quit IRC (Quit: Ping timeout (120 seconds))
22:18 🔗 kode54 has joined #archiveteam-bs
22:39 🔗 OrIdow6 http://www.freedb.org/en/download__database.10.html "Here you can download the freedb database."
22:39 🔗 OrIdow6 1st mirror doesn't work for me, but 2nd does - updates as recently as half a month ago
22:42 🔗 godane SketchCow: So i found more scans of macformat magazine
22:42 🔗 godane they're on macintoshgarden.org
22:49 🔗 DFJustin has quit IRC (Remote host closed the connection)
22:52 🔗 JAA So yeah, we can throw freedb into AB, but I don't think it'll grab much.
22:53 🔗 DFJustin has joined #archiveteam-bs
22:53 🔗 JAA We'd need to generate all the possible URLs from the DB file probably.
22:53 🔗 JAA And then grab those with !ao <.
22:53 🔗 JAA (Or some other method if there are too many.)
22:58 🔗 Stilettoo has joined #archiveteam-bs
22:58 🔗 arkiver JAA: yeah, we can do a little project for it
22:59 🔗 arkiver isn't there many many millions of URLs/
22:59 🔗 arkiver ?
22:59 🔗 Stiletto has quit IRC (Read error: Operation timed out)
22:59 🔗 OrIdow6 Assuming records are evenly distributed in the tars, there are something like 500 000 records
23:00 🔗 OrIdow6 Hmm, that must be wrong, though
23:01 🔗 OrIdow6 Wikipedia says 2 000 000
23:01 🔗 JAA ... in 2006
23:01 🔗 OrIdow6 Yes
23:04 🔗 JAA There are 2**32 possible CDDB IDs, so we could in theory bruteforce that, but let's not.
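The 2**32 figure follows from how a CDDB disc ID is built: 8 bits of checksum, 16 bits of disc length in seconds, and 8 bits of track count, packed into one 32-bit value. A minimal sketch of the classic computation is below, assuming track offsets in frames (75 per second) and the total disc length in seconds, which are the same fields that appear in a `cddb query` command. The function names are illustrative.

```python
from typing import Sequence

FRAMES_PER_SECOND = 75


def digit_sum(n: int) -> int:
    """Sum of the decimal digits of n."""
    return sum(int(d) for d in str(n))


def cddb_disc_id(offsets: Sequence[int], total_seconds: int) -> int:
    """Compute a 32-bit CDDB disc ID.

    offsets: start of each track in frames, including the 150-frame lead-in.
    total_seconds: position of the lead-out in seconds (assumed to be the
    last field of the query command).
    """
    checksum = sum(digit_sum(offset // FRAMES_PER_SECOND) for offset in offsets)
    length = total_seconds - offsets[0] // FRAMES_PER_SECOND
    return ((checksum % 0xFF) << 24) | (length << 8) | len(offsets)
```

With the values from the query URL JAA posts at 23:35 (offsets 150, 21592, 47662 and a length of 889 seconds), `format(cddb_disc_id([150, 21592, 47662], 889), "08x")` gives `21037703`, the ID used in that URL. Because the ID is derived from the track layout, most of the 2**32 values never occur, which is one reason enumerating the database dump is preferable to brute force.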
23:08 🔗 JAA Checking the latest .tar.bz2 now.
23:14 🔗 JAA > lsar freedb-complete-20191203.tar.bz2 | grep -c '^data/[0-9a-f]\{8\}'
23:14 🔗 JAA 117667
23:14 🔗 JAA Uhm...
23:17 🔗 Stiletto has joined #archiveteam-bs
23:19 🔗 Stilettoo has quit IRC (Ping timeout: 258 seconds)
23:23 🔗 JAA Oh, sections, nvm.
23:24 🔗 JAA 3923817 entries
23:24 🔗 JAA Easy
23:25 🔗 JAA I'll qwarc this.
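The first count above (117667) only matched paths under `data/`, which appears to be just one of several sections in the dump. A rough Python equivalent that counts entries per section is sketched below. It assumes the archive layout is `<section>/<8 hex digits>`, which the log implies but does not spell out, and `make_sample_archive` exists only to produce toy input for trying the function.

```python
import io
import re
import tarfile
from collections import Counter
from typing import BinaryIO

# Assumed layout: "<section>/<8 hex digit disc ID>", optionally prefixed with "./".
ENTRY_RE = re.compile(r"^(?:\./)?([a-z]+)/([0-9a-f]{8})$")


def count_entries(fileobj: BinaryIO) -> Counter:
    """Count database entries per section in a (possibly compressed) tar stream."""
    counts = Counter()
    # "r|*" reads the archive as a stream and detects the compression itself.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            match = ENTRY_RE.match(member.name)
            if match:
                counts[match.group(1)] += 1
    return counts


def make_sample_archive() -> io.BytesIO:
    """Build a tiny in-memory .tar.bz2 with made-up entries, for demonstration only."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:bz2") as tar:
        for name in ("data/0123abcd", "rock/21037703", "rock/ffffffff", "COPYING"):
            payload = b"# xmcd\n"
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf
```

On the real dump one would pass `open("freedb-complete-20191203.tar.bz2", "rb")` and sum the counter's values to get the total across sections.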
23:26 🔗 OrIdow6 As I'm reading it, each request involves the client's username, hostname, & software name & version
23:26 🔗 JAA Yeah, just found that as well.
23:27 🔗 JAA arkiver: ^ That might need a patch in the WBM if we want it to be possible to just plug the WBM into tools to continue using the freedb database from there.
23:35 🔗 JAA URLs look like this: http://freedb.freedb.org/~cddb/cddb.cgi?cmd=cddb+query+21037703+3+150+21592+47662+889&hello=user+host+application+v0.0&proto=6 http://freedb.freedb.org/~cddb/cddb.cgi?cmd=cddb+read+data+21037703&hello=user+host+application+v0.0&proto=6
23:35 🔗 JAA CDDB documentation available at http://ftp.freedb.org/pub/freedb/latest/CDDBPROTO
23:36 🔗 JAA Essentially, the "hello" parameter would have to be ignored in the WBM.
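To make the last few messages concrete: the two URL shapes can be generated from a disc's section, ID, and track data, and "ignoring hello" amounts to comparing URLs with that parameter removed. The sketch below shows both under stated assumptions. The helper names are invented, the hello string is the placeholder from JAA's example, and `lookup_key` only illustrates the idea; it is not how the Wayback Machine actually canonicalizes URLs.

```python
from typing import Sequence
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE = "http://freedb.freedb.org/~cddb/cddb.cgi"
DEFAULT_HELLO = "user host application v0.0"  # placeholder identity from the log


def _cddb_url(cmd: str, hello: str, proto: int) -> str:
    # urlencode turns spaces into "+", matching the URLs quoted above.
    return BASE + "?" + urlencode([("cmd", cmd), ("hello", hello), ("proto", str(proto))])


def query_url(disc_id: str, offsets: Sequence[int], seconds: int,
              hello: str = DEFAULT_HELLO, proto: int = 6) -> str:
    """URL for 'cddb query <id> <ntracks> <offsets...> <seconds>'."""
    fields = [disc_id, str(len(offsets)), *map(str, offsets), str(seconds)]
    return _cddb_url("cddb query " + " ".join(fields), hello, proto)


def read_url(section: str, disc_id: str,
             hello: str = DEFAULT_HELLO, proto: int = 6) -> str:
    """URL for 'cddb read <section> <id>'."""
    return _cddb_url("cddb read {} {}".format(section, disc_id), hello, proto)


def lookup_key(url: str) -> str:
    """Return the URL without its 'hello' parameter, so that requests from
    different clients for the same record compare equal."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != "hello"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

`query_url("21037703", [150, 21592, 47662], 889)` and `read_url("data", "21037703")` reproduce the two example URLs exactly, and `lookup_key` maps any variant of them with a different hello string to the same key.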
23:39 🔗 OrIdow6 has quit IRC (Remote host closed the connection)
23:45 🔗 anarcat you guys rock
