Time |
Nickname |
Message |
00:07
🔗
|
|
netsound has quit IRC (Remote host closed the connection) |
00:11
🔗
|
|
enowaldo has joined #archiveteam-bs |
00:16
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
00:17
🔗
|
|
Frogging has joined #archiveteam-bs |
00:20
🔗
|
|
JSharp has quit IRC (Read error: Connection reset by peer) |
00:28
🔗
|
|
revi has quit IRC (Read error: Connection reset by peer) |
00:33
🔗
|
|
killsushi has quit IRC (Quit: Leaving) |
00:42
🔗
|
|
revi has joined #archiveteam-bs |
00:42
🔗
|
|
JSharp has joined #archiveteam-bs |
00:42
🔗
|
|
omarroth has quit IRC (Ping timeout: 506 seconds) |
00:43
🔗
|
|
omarroth has joined #archiveteam-bs |
00:49
🔗
|
|
omarroth has quit IRC (Quit: Konversation terminated!) |
01:14
🔗
|
|
killsushi has joined #archiveteam-bs |
01:18
🔗
|
|
BlueMax has quit IRC (Quit: Leaving) |
01:59
🔗
|
|
Despatche has quit IRC (Quit: Read error: Connection reset by deer) |
02:12
🔗
|
|
BlueMax has joined #archiveteam-bs |
02:33
🔗
|
|
enowaldo has joined #archiveteam-bs |
02:38
🔗
|
|
Zerote has quit IRC (Ping timeout: 260 seconds) |
02:39
🔗
|
|
enowaldo has quit IRC (Ping timeout: 265 seconds) |
03:08
🔗
|
|
Mateon1 has quit IRC (Remote host closed the connection) |
03:08
🔗
|
|
Mateon1 has joined #archiveteam-bs |
03:11
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
03:20
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
03:28
🔗
|
|
qw3rty117 has joined #archiveteam-bs |
03:30
🔗
|
|
killsushi has quit IRC (Quit: Leaving) |
03:30
🔗
|
|
wyatt8740 has quit IRC (Read error: Connection reset by peer) |
03:31
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
03:34
🔗
|
|
odemgi_ has joined #archiveteam-bs |
03:35
🔗
|
|
qw3rty116 has quit IRC (Read error: Operation timed out) |
03:36
🔗
|
|
odemgi has quit IRC (Read error: Operation timed out) |
03:43
🔗
|
|
odemg has quit IRC (Ping timeout: 615 seconds) |
03:50
🔗
|
|
odemg has joined #archiveteam-bs |
04:24
🔗
|
|
tech234a has joined #archiveteam-bs |
04:35
🔗
|
|
enowaldo has joined #archiveteam-bs |
04:44
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
05:14
🔗
|
|
enowaldo has joined #archiveteam-bs |
05:19
🔗
|
|
enowaldo has quit IRC (Ping timeout: 265 seconds) |
05:29
🔗
|
|
Exairnous has quit IRC (Ping timeout: 615 seconds) |
05:50
🔗
|
|
ndiddy has quit IRC () |
05:55
🔗
|
|
Odd0002_ has joined #archiveteam-bs |
06:00
🔗
|
|
Odd0002 has quit IRC (Ping timeout: 615 seconds) |
06:00
🔗
|
|
Odd0002_ is now known as Odd0002 |
06:23
🔗
|
|
Dimtree has quit IRC () |
06:34
🔗
|
|
tech234a has quit IRC (Quit: Connection closed for inactivity) |
06:36
🔗
|
|
Dimtree has joined #archiveteam-bs |
07:08
🔗
|
|
wabu has quit IRC (Read error: Operation timed out) |
07:10
🔗
|
|
wabu has joined #archiveteam-bs |
07:29
🔗
|
|
purplebot has quit IRC (Read error: Operation timed out) |
07:29
🔗
|
|
sknebel has quit IRC (Read error: Connection reset by peer) |
07:29
🔗
|
|
purplebot has joined #archiveteam-bs |
07:30
🔗
|
|
Ing3b0rg has quit IRC (Ping timeout: 506 seconds) |
07:30
🔗
|
|
sknebel has joined #archiveteam-bs |
07:30
🔗
|
|
benjins has joined #archiveteam-bs |
07:32
🔗
|
|
Ing3b0rg has joined #archiveteam-bs |
07:32
🔗
|
|
colona_ has joined #archiveteam-bs |
07:32
🔗
|
|
benjinss has quit IRC (Ping timeout: 506 seconds) |
07:33
🔗
|
|
wp494 has quit IRC (Ping timeout: 506 seconds) |
07:35
🔗
|
|
wp494 has joined #archiveteam-bs |
07:35
🔗
|
|
Fusl has quit IRC (Ping timeout: 740 seconds) |
07:35
🔗
|
|
fratti has joined #archiveteam-bs |
07:36
🔗
|
|
Coderjo_ has joined #archiveteam-bs |
07:36
🔗
|
|
Coderjo has quit IRC (Read error: Connection reset by peer) |
07:36
🔗
|
|
Fusl has joined #archiveteam-bs |
07:36
🔗
|
|
fratti_ has quit IRC (Ping timeout: 506 seconds) |
07:37
🔗
|
|
svchfoo1 sets mode: +o Fusl |
07:37
🔗
|
|
svchfoo3 sets mode: +o Fusl |
07:38
🔗
|
|
purplebot has quit IRC (Ping timeout: 506 seconds) |
07:38
🔗
|
|
purplebot has joined #archiveteam-bs |
07:38
🔗
|
|
colona has quit IRC (Ping timeout: 740 seconds) |
07:56
🔗
|
|
purplebot has quit IRC (Read error: Connection reset by peer) |
07:56
🔗
|
|
purplebot has joined #archiveteam-bs |
08:01
🔗
|
|
closure has quit IRC (Read error: Connection reset by peer) |
08:01
🔗
|
|
closure has joined #archiveteam-bs |
08:04
🔗
|
|
C4K3 has quit IRC (Ping timeout: 506 seconds) |
08:08
🔗
|
|
C4K3 has joined #archiveteam-bs |
08:12
🔗
|
|
eythian has quit IRC (Ping timeout: 740 seconds) |
08:19
🔗
|
|
eythian has joined #archiveteam-bs |
08:23
🔗
|
|
ugh has joined #archiveteam-bs |
08:23
🔗
|
|
legoktm has quit IRC (Read error: Connection reset by peer) |
08:23
🔗
|
|
sknebel has quit IRC (Read error: Connection reset by peer) |
08:25
🔗
|
|
sknebel has joined #archiveteam-bs |
08:43
🔗
|
|
Despatche has joined #archiveteam-bs |
09:34
🔗
|
JAA |
I reran my Sola API URL extraction with a better regex (yay for "parsing" JSON with regex!) and got a few 100k additional URLs. 8.73 million now. I'll start grabbing those shortly. I'll grab the Sola CDN stuff on my machine and throw everything external into ArchiveBot. |
09:34
🔗
|
JAA |
This list includes all URLs that appear within posts, i.e. outlinks from Sola. |
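For illustration only, here is one way the URL extraction could be done with a JSON parser instead of a regex over the raw text. This is not JAA's script: the layout of the Sola API responses is not shown in the log, so the sketch assumes arbitrary nested JSON and walks every string value. The names `URL_RE`, `iter_strings` and `extract_urls` are made up for this example.

```python
import json
import re

# Loose URL pattern, applied only to already-decoded string values (an assumption,
# not the regex used in the log).
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def iter_strings(node):
    """Yield every string value found anywhere inside a decoded JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def extract_urls(raw_json):
    """Return the set of URLs appearing in any string of the JSON text."""
    urls = set()
    for text in iter_strings(json.loads(raw_json)):
        urls.update(URL_RE.findall(text))
    return urls
```

Decoding first means JSON escapes such as `\/` are already resolved before the pattern runs, which is one place where a regex over the raw text can plausibly miss URLs.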
09:54
🔗
|
|
noirscape has joined #archiveteam-bs |
09:55
🔗
|
|
JAA has quit IRC (Read error: Operation timed out) |
09:56
🔗
|
|
Frogging has quit IRC (Read error: Operation timed out) |
09:56
🔗
|
|
cfarquhar has quit IRC (Read error: Operation timed out) |
09:56
🔗
|
|
Frogging has joined #archiveteam-bs |
09:56
🔗
|
|
wabu has quit IRC (Read error: Operation timed out) |
09:57
🔗
|
|
svchfoo1 has quit IRC (Read error: Operation timed out) |
09:58
🔗
|
|
c4rc4s has quit IRC (Read error: Operation timed out) |
09:58
🔗
|
|
simon816 has quit IRC (Ping timeout: 246 seconds) |
09:58
🔗
|
|
eythian has quit IRC (Read error: Operation timed out) |
10:00
🔗
|
|
wp494 has quit IRC (Ping timeout: 493 seconds) |
10:01
🔗
|
|
Mateon1 has quit IRC (Ping timeout: 493 seconds) |
10:01
🔗
|
|
Mateon1 has joined #archiveteam-bs |
10:02
🔗
|
|
wp494 has joined #archiveteam-bs |
10:18
🔗
|
|
eythian has joined #archiveteam-bs |
10:34
🔗
|
|
wacky has quit IRC (Ping timeout: 615 seconds) |
10:45
🔗
|
|
Verified_ has quit IRC (Ping timeout: 252 seconds) |
10:52
🔗
|
|
Hani111 has joined #archiveteam-bs |
10:57
🔗
|
|
svchfoo1 has joined #archiveteam-bs |
10:57
🔗
|
|
Fusl sets mode: +o svchfoo1 |
10:57
🔗
|
|
simon816 has joined #archiveteam-bs |
10:57
🔗
|
|
c4rc4s has joined #archiveteam-bs |
10:57
🔗
|
|
wacky has joined #archiveteam-bs |
10:58
🔗
|
|
cfarquhar has joined #archiveteam-bs |
10:59
🔗
|
|
Verified_ has joined #archiveteam-bs |
11:00
🔗
|
|
JAA has joined #archiveteam-bs |
11:00
🔗
|
|
Fusl sets mode: +o JAA |
11:00
🔗
|
|
bakJAA sets mode: +o JAA |
11:01
🔗
|
|
Hani has quit IRC (Ping timeout: 615 seconds) |
11:01
🔗
|
|
Hani111 is now known as Hani |
11:01
🔗
|
|
wabu has joined #archiveteam-bs |
11:15
🔗
|
|
enowaldo has joined #archiveteam-bs |
11:16
🔗
|
|
Stiletto has quit IRC (Ping timeout: 265 seconds) |
11:28
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
11:43
🔗
|
|
Sian1468 has joined #archiveteam-bs |
11:45
🔗
|
schbirid |
JAA: i've been archiving http://srv.deutschlandradio.de/aodlistaudio.1706.de.rpc for about two years now, just shout if you want the files (via rclone, i have them in gc) |
11:46
🔗
|
|
Sian1468 has quit IRC (Client Quit) |
11:51
🔗
|
JAA |
schbirid: Come to #radio-archive and talk to arkiver! |
12:12
🔗
|
schbirid |
hm, that seems more like live radio |
12:12
🔗
|
schbirid |
what i archive is more "mediatheken" (yay germany) |
12:12
🔗
|
schbirid |
on-demand stuff |
12:16
🔗
|
JAA |
Ah, sorry, just read Deutschland Radio and "archiving for two years" and assumed it was the live stream without checking. |
12:16
🔗
|
JAA |
Any idea about the size? |
12:18
🔗
|
schbirid |
true, that was misleading |
12:18
🔗
|
schbirid |
let me have a look |
12:23
🔗
|
|
icedice has joined #archiveteam-bs |
12:28
🔗
|
schbirid |
this will take a while =) |
12:37
🔗
|
|
BlueMax has quit IRC (Quit: Leaving) |
12:44
🔗
|
JAA |
I've started my Sola CDN retrieval. 6.52 million URLs on cdn.solacore.net and s3.amazonaws.com. |
12:46
🔗
|
JAA |
load average: 8.46, 8.51, 5.63 |
12:47
🔗
|
JAA |
Time for this machine to do some work. |
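A minimal sketch of the split described above, with CDN URLs grabbed locally and everything else going to ArchiveBot. The two hostnames come from the log; treating subdomains such as a bucket name in front of s3.amazonaws.com as CDN hosts is an assumption, and `CDN_HOSTS`, `is_cdn_url` and `partition_urls` are hypothetical names.

```python
from urllib.parse import urlsplit

# Hostnames mentioned in the log; subdomain matching below is an assumption.
CDN_HOSTS = ("cdn.solacore.net", "s3.amazonaws.com")


def is_cdn_url(url, cdn_hosts=CDN_HOSTS):
    """True if the URL's host is one of the CDN hosts or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in cdn_hosts)


def partition_urls(urls):
    """Split URLs into (cdn, external) lists, preserving input order."""
    cdn, external = [], []
    for url in urls:
        (cdn if is_cdn_url(url) else external).append(url)
    return cdn, external
```

Strings that do not parse to a hostname end up in the external list, so they are not silently dropped.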
12:52
🔗
|
|
omarroth has joined #archiveteam-bs |
12:58
🔗
|
|
bitBaron has joined #archiveteam-bs |
13:09
🔗
|
|
Hani111 has joined #archiveteam-bs |
13:10
🔗
|
|
Hani has quit IRC (Read error: Operation timed out) |
13:14
🔗
|
|
Hani111 has quit IRC (Ping timeout: 255 seconds) |
13:14
🔗
|
|
Hani has joined #archiveteam-bs |
13:29
🔗
|
|
omarroth has quit IRC (Remote host closed the connection) |
13:31
🔗
|
|
omarroth has joined #archiveteam-bs |
13:32
🔗
|
|
enowaldo has joined #archiveteam-bs |
13:49
🔗
|
|
Zerote has joined #archiveteam-bs |
13:53
🔗
|
|
overflowe has quit IRC (Remote host closed the connection) |
13:54
🔗
|
|
enowaldo has quit IRC (Ping timeout: 265 seconds) |
14:07
🔗
|
|
omarroth has quit IRC (Ping timeout: 246 seconds) |
14:08
🔗
|
|
omarroth has joined #archiveteam-bs |
14:10
🔗
|
|
bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…) |
14:17
🔗
|
|
bitBaron has joined #archiveteam-bs |
14:32
🔗
|
|
Sian1468 has joined #archiveteam-bs |
14:37
🔗
|
|
Sian1468 has quit IRC (Client Quit) |
14:38
🔗
|
|
argus has joined #archiveteam-bs |
15:18
🔗
|
|
Hani111 has joined #archiveteam-bs |
15:19
🔗
|
|
Hani111_ has joined #archiveteam-bs |
15:21
🔗
|
|
Hani111__ has joined #archiveteam-bs |
15:23
🔗
|
|
bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…) |
15:23
🔗
|
|
omarroth has quit IRC (Read error: Connection reset by peer) |
15:25
🔗
|
JAA |
Sola CDN: 3 hours, about 10 % done, so I expect it to finish tomorrow afternoon. |
15:27
🔗
|
|
Hani has quit IRC (Ping timeout: 615 seconds) |
15:27
🔗
|
|
Hani111__ is now known as Hani |
15:29
🔗
|
|
Hani111 has quit IRC (Ping timeout: 615 seconds) |
15:31
🔗
|
|
Hani111_ has quit IRC (Ping timeout: 615 seconds) |
15:39
🔗
|
|
Rome_Silv has joined #archiveteam-bs |
15:47
🔗
|
|
VADemon has joined #archiveteam-bs |
15:50
🔗
|
VADemon |
ivan, I've just received a notice that a YT channel has been geoblocked in DE (politics) can you please archive it? link: |
15:50
🔗
|
VADemon |
https://www.youtube.com/channel/UC4teOyQNXzMH94YMYySl9mA |
15:53
🔗
|
ivan |
grabbing |
15:55
🔗
|
|
icedice has quit IRC (Quit: Leaving) |
16:37
🔗
|
|
VerifiedJ has joined #archiveteam-bs |
17:02
🔗
|
|
VerifiedJ has quit IRC (Read error: Connection reset by peer) |
17:02
🔗
|
|
VerifiedJ has joined #archiveteam-bs |
17:06
🔗
|
|
VADemon has quit IRC (Quit: Bye) |
17:06
🔗
|
|
VADemon has joined #archiveteam-bs |
17:08
🔗
|
|
VerifiedJ has quit IRC (Ping timeout: 252 seconds) |
17:08
🔗
|
JAA |
How did I not know about this? http://webarchive.loc.gov/ |
17:09
🔗
|
|
VerifiedJ has joined #archiveteam-bs |
17:12
🔗
|
ivan |
I guess you knew https://www.webarchive.org.uk/ |
17:12
🔗
|
|
VerifiedJ has quit IRC (Read error: Connection reset by peer) |
17:14
🔗
|
JAA |
Yeah |
17:14
🔗
|
|
enowaldo has joined #archiveteam-bs |
17:15
🔗
|
|
godane has quit IRC (Quit: Leaving.) |
17:19
🔗
|
|
enowaldo has quit IRC (Ping timeout: 268 seconds) |
17:20
🔗
|
|
bitBaron has joined #archiveteam-bs |
17:25
🔗
|
|
godane has joined #archiveteam-bs |
17:36
🔗
|
|
bitBaron has quit IRC (My computer has gone to sleep. 😴😪ZZZzzz…) |
17:37
🔗
|
|
bitBaron has joined #archiveteam-bs |
18:02
🔗
|
schbirid |
JAA: ~2TB |
18:02
🔗
|
schbirid |
~50G per month |
18:06
🔗
|
|
enowaldo has joined #archiveteam-bs |
18:07
🔗
|
|
ndiddy has joined #archiveteam-bs |
18:11
🔗
|
|
Verified_ has quit IRC (Ping timeout: 252 seconds) |
18:16
🔗
|
|
Verified_ has joined #archiveteam-bs |
18:24
🔗
|
|
Verified_ has quit IRC (Ping timeout: 252 seconds) |
18:36
🔗
|
|
Verified_ has joined #archiveteam-bs |
18:45
🔗
|
|
killsushi has joined #archiveteam-bs |
18:50
🔗
|
JAA |
schbirid: WARCs? |
18:59
🔗
|
|
bitBaron has quit IRC (My computer has gone to sleep. 😴😪ZZZzzz…) |
19:38
🔗
|
JAA |
wpull segfaulting, always a pleasure. |
19:44
🔗
|
|
RomeSilva has joined #archiveteam-bs |
19:46
🔗
|
|
Rome_Silv has quit IRC (Ping timeout: 252 seconds) |
19:51
🔗
|
|
cfarquhar has quit IRC (Read error: Operation timed out) |
19:56
🔗
|
|
cfarquhar has joined #archiveteam-bs |
19:57
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
19:58
🔗
|
schbirid |
JAA: hell no, just the media files :) |
19:59
🔗
|
JAA |
Aw ;-) |
19:59
🔗
|
JAA |
Did you keep metadata as well? |
20:12
🔗
|
schbirid |
nope :D |
20:12
🔗
|
JAA |
:-( |
20:14
🔗
|
JAA |
Is there a way to get the relevant metadata now? |
20:16
🔗
|
schbirid |
don't think so |
20:16
🔗
|
schbirid |
now i feel bad :D |
20:16
🔗
|
schbirid |
i should simply zip up the feed pages daily |
20:16
🔗
|
schbirid |
the files are nicely named and are in a y/m/d structure on the originating server though so it is pretty good |
20:26
🔗
|
JAA |
Looks like it should be possible to collect the information. |
20:26
🔗
|
JAA |
E.g. https://srv.deutschlandradio.de/aodlistaudio.1706.de.rpc?drau:station_id=4&drau:from=01.10.2018&drau:to=01.10.2018&drau:page=1&drau:limit=30 |
20:29
🔗
|
JAA |
Or rather something like https://srv.deutschlandradio.de/aodlistaudio.1706.de.rpc?drau:page=1&drau:limit=30 |
20:30
🔗
|
JAA |
995 pages with limit=100. |
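As a rough sketch, the paginated listing URLs mentioned above could be generated like this. The endpoint and the `drau:page` / `drau:limit` parameters are taken from the URLs in the log; the page count of 995 is what JAA observed at the time and would need to be rechecked. `BASE_URL` and `page_urls` are names invented for this example.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the log.
BASE_URL = "https://srv.deutschlandradio.de/aodlistaudio.1706.de.rpc"


def page_urls(pages, limit):
    """Yield one listing URL per page, numbered from 1 as in the log's example."""
    for page in range(1, pages + 1):
        # safe=":" keeps the colon in the parameter names unescaped,
        # matching the form of the URLs quoted in the log.
        query = urlencode({"drau:page": page, "drau:limit": limit}, safe=":")
        yield f"{BASE_URL}?{query}"
```

Calling `list(page_urls(995, 100))` would give the 995 listing pages for limit=100; fetching and storing them is left out here.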
20:32
🔗
|
|
enowaldo has joined #archiveteam-bs |
20:41
🔗
|
|
enowaldo has quit IRC (Ping timeout: 492 seconds) |
21:05
🔗
|
|
bitBaron has joined #archiveteam-bs |
21:25
🔗
|
|
Rome has joined #archiveteam-bs |
21:31
🔗
|
|
RomeSilva has quit IRC (Read error: Operation timed out) |
21:41
🔗
|
|
wabu has quit IRC (Read error: Operation timed out) |
21:42
🔗
|
|
cfarquhar has quit IRC (Read error: Operation timed out) |
21:42
🔗
|
|
Exairnous has joined #archiveteam-bs |
21:42
🔗
|
|
svchfoo1 has quit IRC (Read error: Operation timed out) |
21:44
🔗
|
|
wm_ has joined #archiveteam-bs |
21:44
🔗
|
|
simon816 has quit IRC (Ping timeout: 246 seconds) |
21:44
🔗
|
|
c4rc4s has quit IRC (Read error: Operation timed out) |
21:45
🔗
|
|
JAA has quit IRC (Ping timeout: 246 seconds) |
21:45
🔗
|
|
Rome has quit IRC (Remote host closed the connection) |
21:45
🔗
|
|
Rome has joined #archiveteam-bs |
22:15
🔗
|
|
tech234a has joined #archiveteam-bs |
22:43
🔗
|
|
simon816 has joined #archiveteam-bs |
22:43
🔗
|
|
c4rc4s has joined #archiveteam-bs |
22:43
🔗
|
|
svchfoo1 has joined #archiveteam-bs |
22:43
🔗
|
|
Fusl sets mode: +o svchfoo1 |
22:43
🔗
|
|
cfarquhar has joined #archiveteam-bs |
22:44
🔗
|
|
JAA has joined #archiveteam-bs |
22:44
🔗
|
|
Fusl sets mode: +o JAA |
22:44
🔗
|
|
bakJAA sets mode: +o JAA |
22:46
🔗
|
|
wabu has joined #archiveteam-bs |
23:26
🔗
|
|
ndiddy has quit IRC (Ping timeout: 252 seconds) |
23:28
🔗
|
|
BlueMax has joined #archiveteam-bs |
23:34
🔗
|
|
Damme_ has joined #archiveteam-bs |
23:36
🔗
|
|
Atom-- has joined #archiveteam-bs |
23:36
🔗
|
|
benjinsmi has joined #archiveteam-bs |
23:36
🔗
|
|
dxrt- has joined #archiveteam-bs |
23:36
🔗
|
|
ats_ has joined #archiveteam-bs |
23:39
🔗
|
|
sep332_ has joined #archiveteam-bs |
23:41
🔗
|
|
bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…) |
23:41
🔗
|
|
zino has quit IRC (Quit: Leaving) |
23:41
🔗
|
|
zino has joined #archiveteam-bs |
23:42
🔗
|
|
svchfoo1 sets mode: +o zino |
23:42
🔗
|
|
svchfoo3 sets mode: +o zino |
23:42
🔗
|
|
joepie91_ has joined #archiveteam-bs |
23:43
🔗
|
|
Rome has quit IRC (Read error: Connection reset by peer) |
23:44
🔗
|
|
Rome has joined #archiveteam-bs |
23:44
🔗
|
|
Verified_ has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
C4K3 has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
benjins has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
Damme has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
joepie91 has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
ats has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
sep332 has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
Atom__ has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
dxrt has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
jut has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
coderobe has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
ColdIce has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
kiska has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
ranma has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
Flashfire has quit IRC (se.hub irc.underworld.no) |
23:44
🔗
|
|
i0npulse has quit IRC (se.hub irc.underworld.no) |
23:48
🔗
|
|
C4K3_ has joined #archiveteam-bs |
23:50
🔗
|
JAA |
Sola CDN: 11 hours, 38 % done, still on track for ~30 hours total |
23:55
🔗
|
|
enowaldo has joined #archiveteam-bs |
23:56
🔗
|
JAA |
Or not since my disks are filling up. |
23:56
🔗
|
JAA |
400 GiB already... |
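The progress figures above can be turned into rough totals with a straight-line extrapolation. This is only a sketch: it assumes the remaining URLs take about as long and are about as large as the ones already done, which the log does not confirm. `extrapolate_total` is a hypothetical helper.

```python
def extrapolate_total(so_far, fraction_done):
    """Linearly extrapolate a total from the amount so far and the fraction done."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return so_far / fraction_done


# Figures from the log: 3 h at 10 %, 11 h at 38 %, 400 GiB at 38 %.
hours_early = extrapolate_total(3, 0.10)    # about 30 hours
hours_late = extrapolate_total(11, 0.38)    # about 29 hours
size_gib = extrapolate_total(400, 0.38)     # a bit over 1000 GiB
```

Under that assumption, both time estimates agree with the "~30 hours" figure, and the full grab would need roughly 650 GiB more disk space than was used at this point.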