Time |
Nickname |
Message |
00:07
🔗
|
bluefoo_ |
anyone know if archive.org actually manages to archive twitter content or if it's just the javascript loading it |
00:08
🔗
|
JAA |
No idea what the WBM SPN does (though I'd assume it works), but the ArchiveBot grabs seem to play back fine. |
00:08
🔗
|
JAA |
Once Twitter rolls out the redesign, that will almost certainly change. |
00:11
🔗
|
bluefoo_ |
what's an SPN |
00:11
🔗
|
Kaz |
save page now |
00:15
🔗
|
kiiwii |
I found this other piece of software called gopherbot, I think I may wanna use this for archiving gopherspace but I don't know how to execute haskell code. gopher://gopherspace.de:70/1/menu/Downloads/Gopher_Querying/gopherbot/ |
00:19
🔗
|
bluefoo_ |
ahh |
00:19
🔗
|
bluefoo_ |
kiiwii: i don't even know how to open that link |
00:19
🔗
|
bluefoo_ |
kiiwii: is it just raw haskell code? |
00:19
🔗
|
bluefoo_ |
kiiwii: is there a setup.hs file? |
00:20
🔗
|
bluefoo_ |
if you need to compile it, you probably need to install cabal-install, which is the haskell build tool |
00:20
🔗
|
kiiwii |
There's a setup.lhs file |
00:21
🔗
|
kiiwii |
Config.hs COPYING.txt COPYRIGHT.txt DB.hs DBProcs.hs DirParser.hs gopherbot.cabal.txt gopherbot.hs Makefile.txt NetClient.hs RobotsTxt.hs Setup.lhs Types.hs Utils.hs |
00:21
🔗
|
|
jodizzle has quit IRC (Quit: ZNC 1.7.1 - https://znc.in) |
00:21
🔗
|
kiiwii |
Those are the files |
00:21
🔗
|
|
jodizzle has joined #archiveteam-bs |
00:42
🔗
|
godane |
SketchCow: film score monthly is getting uploaded: https://archive.org/details/Film_Score_Monthly_Volume_01_Issue_02_1990_06_Vineyard_Haven_US |
01:02
🔗
|
|
SoraUta has joined #archiveteam-bs |
01:09
🔗
|
|
superkuh_ has joined #archiveteam-bs |
02:46
🔗
|
kiiwii |
Turns out the gopherbot code is over a decade old meaning it won't compile on a current machine :/ |
02:46
🔗
|
kiiwii |
So I'm just gonna have to continue using the python code |
02:46
🔗
|
Wingy |
Can you use wget? |
02:46
🔗
|
Wingy |
Oh I think only curl supports gopher |
02:47
🔗
|
Wingy |
So are you trying to archive *every* gopher server? |
02:47
🔗
|
kiiwii |
Every gopherhole, yeah |
02:47
🔗
|
markedL |
why was gopherbot better? |
02:47
🔗
|
kiiwii |
I wasn't sure, I wanted to try it out |
02:48
🔗
|
kiiwii |
But with the way I'm doing it, I have to type in every site manually. |
02:48
🔗
|
kiiwii |
And there's ~300 gopherholes out there |
02:49
🔗
|
Wingy |
Can you recursively download gopher://gopher.quux.org/1/Software/Gopher/servers? |
02:49
🔗
|
kiiwii |
I can try. |
02:49
🔗
|
markedL |
we can put the python code into the warriors |
02:50
🔗
|
kiiwii |
How can I send you the file? Or should I send the link to the github repo? |
02:50
🔗
|
markedL |
though, really, could just script it also if you don't need more than a few IPs |
02:54
🔗
|
kiiwii |
Yeah, I can't download recursively from gopher://gopher.quux.org/1/Software/Gopher/servers |
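[Editor's sketch, not kiiwii's code] A rough illustration of why a recursive Gopher grab needs a menu parser rather than a plain downloader: each menu line carries an item type, a selector, a host and a port, and a crawler has to follow those instead of hyperlinks. The names `GopherItem`, `fetch`, `parse_menu` and `linked_hosts` are made up for this example. The wire format is assumed to be the classic RFC 1436 one (selector plus CRLF as the request, tab-separated menu lines, a lone `.` as terminator).

```python
import socket
from typing import List, NamedTuple, Tuple


class GopherItem(NamedTuple):
    item_type: str   # "1" = menu, "0" = text file, "i" = info line, ...
    display: str
    selector: str
    host: str
    port: int


def fetch(host: str, selector: str = "", port: int = 70, timeout: float = 30.0) -> bytes:
    """Send one selector and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(selector.encode("utf-8") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_menu(raw: bytes) -> List[GopherItem]:
    """Split a menu response into items, skipping malformed lines."""
    items = []
    for line in raw.decode("utf-8", errors="replace").splitlines():
        if line == ".":          # end-of-menu marker
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4 or not fields[3].strip().isdigit():
            continue
        items.append(GopherItem(line[0], fields[0], fields[1], fields[2], int(fields[3])))
    return items


def linked_hosts(items: List[GopherItem]) -> List[Tuple[str, int]]:
    """Distinct (host, port) pairs a menu points at, ignoring info lines."""
    return sorted({(i.host, i.port) for i in items if i.item_type != "i"})
```

For the URL above, the leading `1` in the path is the item type and the selector would be `/Software/Gopher/servers`, so a call might look like `parse_menu(fetch("gopher.quux.org", "/Software/Gopher/servers"))`. Feeding `linked_hosts` back into a work queue is one way to avoid typing every gopherhole in by hand; politeness delays and loop detection are left out here.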
03:24
🔗
|
|
cerca has quit IRC (Remote host closed the connection) |
04:08
🔗
|
Flashfire |
bellsouth.net should probably be scraped at some stage, a lot of websites are hosted on it |
04:17
🔗
|
kiiwii |
Starting to archive gopher.quux.org, this may be a big gopherhole |
04:17
🔗
|
|
odemgi_ has joined #archiveteam-bs |
04:22
🔗
|
|
odemgi has quit IRC (Read error: Operation timed out) |
04:30
🔗
|
|
kiska has quit IRC (Remote host closed the connection) |
04:30
🔗
|
|
Flashfire has quit IRC (Remote host closed the connection) |
04:31
🔗
|
|
Flashfire has joined #archiveteam-bs |
04:31
🔗
|
|
kiska has joined #archiveteam-bs |
04:31
🔗
|
Flashfire |
Kiska you reset it did you? |
04:31
🔗
|
|
svchfoo1 sets mode: +o kiska |
04:31
🔗
|
|
svchfoo3 sets mode: +o kiska |
04:32
🔗
|
kiska |
Huh? |
04:43
🔗
|
|
superkuh_ has quit IRC (Quit: the neuronal action potential is an electrical manipulation of reversible abrupt phase changes in the lipid bilaye) |
04:45
🔗
|
|
stapler11 has joined #archiveteam-bs |
04:47
🔗
|
|
odemgi has joined #archiveteam-bs |
04:51
🔗
|
|
odemgi_ has quit IRC (Read error: Connection reset by peer) |
04:55
🔗
|
|
qw3rty has joined #archiveteam-bs |
05:04
🔗
|
|
qw3rty2 has quit IRC (Ping timeout: 745 seconds) |
05:05
🔗
|
|
DogsRNice has quit IRC (Read error: Connection reset by peer) |
05:13
🔗
|
SketchCow |
I'm happy to report FOS is running "only" about 24 hours behind uploading Archivebot grabs. |
05:58
🔗
|
godane |
SketchCow: so more interesting cover art of japanese manuals are coming |
05:59
🔗
|
godane |
mostly cause of Hitachi Microwave ovens in 96xxx area |
06:09
🔗
|
|
HP_Archiv has joined #archiveteam-bs |
06:11
🔗
|
|
Jopik has quit IRC (Read error: Connection reset by peer) |
06:11
🔗
|
|
Jopik has joined #archiveteam-bs |
06:30
🔗
|
|
SoraUta has quit IRC (Remote host closed the connection) |
06:31
🔗
|
|
SoraUta has joined #archiveteam-bs |
07:54
🔗
|
|
killsushi has quit IRC (Quit: Leaving) |
08:36
🔗
|
|
stapler11 has quit IRC (Read error: Connection reset by peer) |
08:46
🔗
|
|
LowLevelM has quit IRC (Read error: Operation timed out) |
08:47
🔗
|
|
LowLevelM has joined #archiveteam-bs |
09:05
🔗
|
|
LowLevelM has quit IRC (Read error: Operation timed out) |
09:34
🔗
|
|
LowLevelM has joined #archiveteam-bs |
09:47
🔗
|
|
trc has joined #archiveteam-bs |
10:15
🔗
|
|
kiska18 has quit IRC (Remote host closed the connection) |
10:15
🔗
|
|
Ryz has quit IRC (Remote host closed the connection) |
10:15
🔗
|
|
kiska18 has joined #archiveteam-bs |
10:16
🔗
|
|
Ryz has joined #archiveteam-bs |
10:16
🔗
|
|
svchfoo3 sets mode: +o kiska18 |
10:16
🔗
|
|
svchfoo1 sets mode: +o kiska18 |
10:48
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
10:51
🔗
|
|
Atom__ has joined #archiveteam-bs |
10:57
🔗
|
|
Atom-- has quit IRC (Read error: Operation timed out) |
11:01
🔗
|
|
tech234a has quit IRC (Quit: Connection closed for inactivity) |
11:41
🔗
|
|
cerca has joined #archiveteam-bs |
11:57
🔗
|
|
LeighR has joined #archiveteam-bs |
12:09
🔗
|
|
ephemer0l has joined #archiveteam-bs |
12:20
🔗
|
|
Wingy has quit IRC (Remote host closed the connection) |
12:42
🔗
|
|
ephemer0l has quit IRC (Ping timeout: 745 seconds) |
12:44
🔗
|
|
Ravenloft has quit IRC (Read error: Operation timed out) |
12:53
🔗
|
|
qwebirc60 has quit IRC (Quit: Page closed) |
12:53
🔗
|
|
InkArchiv has joined #archiveteam-bs |
13:21
🔗
|
|
SoraUta has quit IRC (Read error: Operation timed out) |
13:35
🔗
|
|
SoraUta has joined #archiveteam-bs |
13:41
🔗
|
|
chazchaz has quit IRC (Read error: Operation timed out) |
13:41
🔗
|
|
chazchaz has joined #archiveteam-bs |
13:46
🔗
|
|
SoraUta has quit IRC (Read error: Operation timed out) |
13:47
🔗
|
|
Sauce has joined #archiveteam-bs |
13:47
🔗
|
|
Sauce is now known as amdk6 |
14:10
🔗
|
|
amdk6 has quit IRC (C:\exit.exe) |
14:14
🔗
|
|
superkuh_ has joined #archiveteam-bs |
14:25
🔗
|
|
Wingy has joined #archiveteam-bs |
14:27
🔗
|
|
Wingy has quit IRC (Client Quit) |
14:27
🔗
|
|
Wingy has joined #archiveteam-bs |
14:37
🔗
|
|
ephemer0l has joined #archiveteam-bs |
15:15
🔗
|
SootBectr |
re: https://archiveteam.org/index.php?title=Warrior#Can_I_use_whatever_internet_access_for_the_warrior.3F I assume that Pi-Hole (or any other DNS blocklist setup) should not be used? If that's the case it could be made explicit there. |
15:19
🔗
|
JAA |
Correct, such things should not be used with workers. |
15:20
🔗
|
Wingy |
JAA: Do you know if anyone responded re: my wiki account? |
15:21
🔗
|
JAA |
Wingy: jrwr can't do it, so you'll have to wait for SketchCow. |
15:21
🔗
|
Wingy |
Okay thanks :) |
15:25
🔗
|
SootBectr |
Thanks JAA |
15:26
🔗
|
Wingy |
JAA: Should I change the warrior-dockerfile to always use 1.1.1.1 or 8.8.8.8? |
15:26
🔗
|
Wingy |
(and submit as PR ofc) |
15:30
🔗
|
LeighR |
Wingy: ooo, that's a good idea |
15:30
🔗
|
SootBectr |
Some people make their router redirect all DNS traffic to their chosen server, so it may be worth mentioning on the wiki. |
15:30
🔗
|
LeighR |
Wingy: at the very least, it should be an ENV, and set to 1.1.1.1 or 8.8.8.8 as the default |
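[Editor's sketch, not the actual warrior-dockerfile change] One way the "ENV with a public resolver as the default" idea could look, written as a small Python helper that a container entrypoint might call. The variable name `WARRIOR_DNS` and the function `resolv_conf` are hypothetical; nothing in the log says what the PR actually used.

```python
import ipaddress
import os

# Defaults suggested in the discussion above.
DEFAULT_DNS = "1.1.1.1 8.8.8.8"


def resolv_conf(environ=None) -> str:
    """Render resolv.conf content from a space-separated WARRIOR_DNS value.

    WARRIOR_DNS is a hypothetical variable name; an unset or empty value
    falls back to DEFAULT_DNS.
    """
    if environ is None:
        environ = os.environ
    servers = (environ.get("WARRIOR_DNS") or "").split() or DEFAULT_DNS.split()
    for server in servers:
        ipaddress.ip_address(server)  # raises ValueError for anything that is not an IP
    return "".join(f"nameserver {server}\n" for server in servers)
```

As SootBectr points out, a router that redirects all DNS traffic to its own resolver would still override whatever the container is configured with, so a setting like this only covers the host-level blocklist case.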
15:39
🔗
|
Kaz |
yeah, shoot it through |
15:42
🔗
|
|
InkArchiv has quit IRC (Quit: Page closed) |
16:07
🔗
|
|
trc has quit IRC (Quit: Goodbye) |
16:44
🔗
|
godane |
SketchCow: this may interest you : http://publ.lib.ru/ARCHIVES/ |
16:44
🔗
|
godane |
tons of russian stuff |
16:45
🔗
|
JAA |
Looks like there was an AB job for all of http://www.publ.lib.ru/ two years ago. |
17:09
🔗
|
|
i0npulse has quit IRC (Ping timeout: 248 seconds) |
17:31
🔗
|
godane |
ok |
17:54
🔗
|
|
i0npulse has joined #archiveteam-bs |
18:20
🔗
|
|
tech234a has joined #archiveteam-bs |
18:51
🔗
|
|
schbirid has joined #archiveteam-bs |
19:01
🔗
|
|
Ravenloft has joined #archiveteam-bs |
19:14
🔗
|
|
LeighR has quit IRC (Ping timeout: 260 seconds) |
19:21
🔗
|
|
Stilettoo has joined #archiveteam-bs |
19:22
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
19:39
🔗
|
|
kiiwii has quit IRC (Quit: Konversation terminated!) |
19:39
🔗
|
|
kiiwii has joined #archiveteam-bs |
19:55
🔗
|
|
Ravenloft has quit IRC (Read error: Operation timed out) |
20:30
🔗
|
|
SoraUta has joined #archiveteam-bs |
20:51
🔗
|
|
X-Scale` has joined #archiveteam-bs |
20:59
🔗
|
|
X-Scale has quit IRC (Ping timeout: 610 seconds) |
20:59
🔗
|
|
X-Scale` is now known as X-Scale |
21:03
🔗
|
|
BlueMax has joined #archiveteam-bs |
21:07
🔗
|
|
killsushi has joined #archiveteam-bs |
21:14
🔗
|
|
schbirid has quit IRC (Quit: Leaving) |
21:25
🔗
|
|
ephemer0l has quit IRC (Quit: http://quassel-irc.org - Chat comfortably. Anywhere.) |
21:46
🔗
|
|
ephemer0l has joined #archiveteam-bs |
21:50
🔗
|
|
mtntmnky has quit IRC (Remote host closed the connection) |
21:50
🔗
|
|
mtntmnky has joined #archiveteam-bs |
22:01
🔗
|
|
ShellyRol has quit IRC (Read error: Connection reset by peer) |
22:03
🔗
|
|
ShellyRol has joined #archiveteam-bs |
22:05
🔗
|
|
Stilettoo has quit IRC (Read error: Operation timed out) |
22:08
🔗
|
|
Stiletto has joined #archiveteam-bs |
22:08
🔗
|
|
kode54 has quit IRC (Quit: Ping timeout (120 seconds)) |
22:18
🔗
|
|
kode54 has joined #archiveteam-bs |
22:39
🔗
|
OrIdow6 |
http://www.freedb.org/en/download__database.10.html "Here you can download the freedb database." |
22:39
🔗
|
OrIdow6 |
1st mirror doesn't work for me, but 2nd does - updates as recently as half a month ago |
22:42
🔗
|
godane |
SketchCow: So i found more scans of macformat magazine |
22:42
🔗
|
godane |
they're on macintoshgarden.org |
22:49
🔗
|
|
DFJustin has quit IRC (Remote host closed the connection) |
22:52
🔗
|
JAA |
So yeah, we can throw freedb into AB, but I don't think it'll grab much. |
22:53
🔗
|
|
DFJustin has joined #archiveteam-bs |
22:53
🔗
|
JAA |
We'd need to generate all the possible URLs from the DB file probably. |
22:53
🔗
|
JAA |
And then grab those with !ao <. |
22:53
🔗
|
JAA |
(Or some other method if there are too many.) |
22:58
🔗
|
|
Stilettoo has joined #archiveteam-bs |
22:58
🔗
|
arkiver |
JAA: yeah, we can do a little project for it |
22:59
🔗
|
arkiver |
isn't there many many millions of URLs/ |
22:59
🔗
|
arkiver |
? |
22:59
🔗
|
|
Stiletto has quit IRC (Read error: Operation timed out) |
22:59
🔗
|
OrIdow6 |
Assuming records are evenly distributed in the tars, there are something like 500 000 records |
23:00
🔗
|
OrIdow6 |
Hmm, that must be wrong, though |
23:01
🔗
|
OrIdow6 |
Wikipedia says 2 000 000 |
23:01
🔗
|
JAA |
... in 2006 |
23:01
🔗
|
OrIdow6 |
Yes |
23:04
🔗
|
JAA |
There are 2**32 possible CDDB IDs, so we could in theory bruteforce that, but let's not. |
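[Editor's sketch] The 2**32 figure follows from the disc ID being a 32-bit value, printed as 8 hex digits. Below is the commonly documented way of deriving it, as far as I understand it: one checksum byte built from the digit sums of the track start times, two bytes for the playing time in seconds, and one byte for the track count. The function names are mine. The sample numbers are the ones from the `cddb query` URL JAA pastes later in this log (3 tracks at frame offsets 150, 21592, 47662 and 889 seconds total).

```python
def digit_sum(seconds: int) -> int:
    """Sum of the decimal digits of a track's start time in seconds."""
    total = 0
    while seconds > 0:
        total += seconds % 10
        seconds //= 10
    return total


def cddb_disc_id(offsets, total_seconds: int) -> int:
    """Compute a CDDB-style disc ID.

    offsets: track start positions in frames (75 frames per second).
    total_seconds: position of the lead-out in seconds, i.e. the disc
    length as sent in a `cddb query` command.
    """
    n = sum(digit_sum(off // 75) for off in offsets)
    t = total_seconds - offsets[0] // 75
    return ((n % 0xFF) << 24) | (t << 8) | len(offsets)


print(format(cddb_disc_id([150, 21592, 47662], 889), "08x"))  # 21037703
```

Because so much of the ID is determined by track count and length, real discs occupy only a small part of the 32-bit space, which fits the decision not to brute-force it.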
23:08
🔗
|
JAA |
Checking the latest .tar.bz2 now. |
23:14
🔗
|
JAA |
> lsar freedb-complete-20191203.tar.bz2 | grep -c '^data/[0-9a-f]\{8\}' |
23:14
🔗
|
JAA |
117667 |
23:14
🔗
|
JAA |
Uhm... |
23:17
🔗
|
|
Stiletto has joined #archiveteam-bs |
23:19
🔗
|
|
Stilettoo has quit IRC (Ping timeout: 258 seconds) |
23:23
🔗
|
JAA |
Oh, sections, nvm. |
23:24
🔗
|
JAA |
3923817 entries |
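[Editor's sketch, not JAA's command] The first count only matched paths under `data/`, which appears to be one of several category directories ("sections") in the dump. A per-category count could be sketched like this, assuming every entry is stored as `<category>/<8 hex digit disc ID>`. That layout is inferred from the grep pattern above, not checked against the archive.

```python
import re
import tarfile
from collections import Counter

# Assumed layout: "<category>/<8 hex digits>", optionally prefixed with "./".
ENTRY = re.compile(r"^(?:\./)?([a-z]+)/([0-9a-f]{8})$")


def count_entries(tar) -> Counter:
    """Count regular files that look like CDDB entries, grouped by category."""
    counts = Counter()
    for member in tar:
        if not member.isfile():
            continue
        match = ENTRY.match(member.name)
        if match:
            counts[match.group(1)] += 1
    return counts


# Usage against the real dump (streams the archive, so it needs one pass only):
#   with tarfile.open("freedb-complete-20191203.tar.bz2", "r|bz2") as tar:
#       counts = count_entries(tar)
#       print(sum(counts.values()), dict(counts))
```

The same loop could also emit one `cddb read` URL per (category, disc ID) pair instead of counting, which is roughly what generating "all the possible URLs from the DB file" would need.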
23:24
🔗
|
JAA |
Easy |
23:25
🔗
|
JAA |
I'll qwarc this. |
23:26
🔗
|
OrIdow6 |
As I'm reading it, each request involves the client's username, hostname, & software name & version |
23:26
🔗
|
JAA |
Yeah, just found that as well. |
23:27
🔗
|
JAA |
arkiver: ^ That might need a patch in the WBM if we want it to be possible to just plug the WBM into tools to continue using the freedb database from there. |
23:35
🔗
|
JAA |
URLs look like this: http://freedb.freedb.org/~cddb/cddb.cgi?cmd=cddb+query+21037703+3+150+21592+47662+889&hello=user+host+application+v0.0&proto=6 http://freedb.freedb.org/~cddb/cddb.cgi?cmd=cddb+read+data+21037703&hello=user+host+application+v0.0&proto=6 |
23:35
🔗
|
JAA |
CDDB documentation available at http://ftp.freedb.org/pub/freedb/latest/CDDBPROTO |
23:36
🔗
|
JAA |
Essentially, the "hello" parameter would have to be ignored in the WBM. |
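[Editor's sketch, not WBM code] To make the "ignore the hello parameter" point concrete: two clients asking for the same disc send URLs that differ only in `hello`, so a playback index keyed on the full URL would miss. The helpers below build a read URL in the shape quoted above and strip `hello` to get a lookup key. `read_url` and `without_hello` are hypothetical names, and this says nothing about how the Wayback Machine canonicalises URLs internally.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE = "http://freedb.freedb.org/~cddb/cddb.cgi"


def read_url(category: str, disc_id: str,
             hello: str = "user host application v0.0", proto: int = 6) -> str:
    """Build a `cddb read` URL in the shape shown in the log."""
    query = urlencode({
        "cmd": f"cddb read {category} {disc_id}",
        "hello": hello,
        "proto": proto,
    })
    return f"{BASE}?{query}"


def without_hello(url: str) -> str:
    """Drop the client-identifying `hello` parameter, keep everything else."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k != "hello"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


a = read_url("data", "21037703")
b = read_url("data", "21037703", hello="alice laptop someplayer v1.2")
print(a)
print(without_hello(a) == without_hello(b))  # True
```

With the defaults, `read_url("data", "21037703")` reproduces the second URL JAA pasted, and both variants collapse to the same key once `hello` is removed.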
23:39
🔗
|
|
OrIdow6 has quit IRC (Remote host closed the connection) |
23:45
🔗
|
anarcat |
you guys rock |