Time |
Nickname |
Message |
00:00
🔗
|
|
odemg has quit IRC (Ping timeout: 265 seconds) |
00:03
🔗
|
|
odemg has joined #archiveteam-bs |
00:19
🔗
|
|
enowaldo_ has quit IRC (Read error: Operation timed out) |
00:52
🔗
|
|
achip has quit IRC (Ping timeout: 255 seconds) |
00:57
🔗
|
|
achip has joined #archiveteam-bs |
00:58
🔗
|
|
w00dsman has quit IRC (Leaving) |
01:34
🔗
|
|
HashbangI has quit IRC (Remote host closed the connection) |
01:35
🔗
|
|
enowaldo has joined #archiveteam-bs |
01:43
🔗
|
|
enowaldo has quit IRC (Ping timeout: 492 seconds) |
01:47
🔗
|
|
jeekl has joined #archiveteam-bs |
01:49
🔗
|
|
zino has joined #archiveteam-bs |
01:49
🔗
|
|
HashbangI has joined #archiveteam-bs |
02:08
🔗
|
|
odemg has quit IRC (Ping timeout: 265 seconds) |
02:08
🔗
|
|
xeam has joined #archiveteam-bs |
02:09
🔗
|
|
odemg has joined #archiveteam-bs |
02:12
🔗
|
|
xeam has left |
03:10
🔗
|
|
w00dsman has joined #archiveteam-bs |
03:17
🔗
|
|
qw3rty112 has joined #archiveteam-bs |
03:22
🔗
|
|
qw3rty111 has quit IRC (Read error: Operation timed out) |
03:41
🔗
|
|
odemgi_ has joined #archiveteam-bs |
03:43
🔗
|
|
odemgi has quit IRC (Read error: Operation timed out) |
03:43
🔗
|
|
odemg has quit IRC (Ping timeout: 265 seconds) |
03:43
🔗
|
|
enowaldo has joined #archiveteam-bs |
03:44
🔗
|
|
bobmcjr has joined #archiveteam-bs |
03:46
🔗
|
bobmcjr |
This is probably worth scraping given this notice: https://assemblergames.com/threads/this-forum-to-close-in-30-days.71032/ |
03:48
🔗
|
JAA |
Already in progress. |
03:51
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
03:52
🔗
|
bobmcjr |
Cool |
03:55
🔗
|
|
odemg has joined #archiveteam-bs |
04:27
🔗
|
bobmcjr |
What am I supposed to do with warcs again? I have a currently dead forum I scraped a few months ago (minus stylesheets and a few icons, sorry). |
04:28
🔗
|
SketchCow |
Upload them to archive.org |
04:29
🔗
|
SketchCow |
They'll go into the warczone collection |
04:29
🔗
|
bobmcjr |
Alright. Webrecorder Player can't see any URLs in this warc for whatever reason. The content is there, and the format appears fine with a quick look in vim. |
04:37
🔗
|
godane |
SketchCow: so finally something interesting in my search for japanese magazines |
04:37
🔗
|
godane |
i found scans someone had put up on mega.nz |
04:37
🔗
|
godane |
so i'm grabbing that |
04:38
🔗
|
godane |
its about 10gb+ from what i can tell |
04:38
🔗
|
|
enowaldo has joined #archiveteam-bs |
04:46
🔗
|
|
Coderjo has quit IRC (Quit: new kernel) |
04:47
🔗
|
|
enowaldo has quit IRC (Ping timeout: 492 seconds) |
04:49
🔗
|
godane |
SketchCow: so i found out Kevin Savetz uploaded 3 ERIC items |
04:51
🔗
|
godane |
i'm going to have to go after that id range again cause i have not touched it since christmas 2014: |
04:51
🔗
|
godane |
https://archive.org/details/ERIC_ED284545 |
04:51
🔗
|
godane |
one of savetz files : https://archive.org/details/ERIC_ED284540 |
05:13
🔗
|
godane |
ok then looks like savetz got a copy of that id from somewhere else |
05:14
🔗
|
godane |
cause ED284540 doesn't have a url on page and this url is 404: |
05:14
🔗
|
godane |
https://files.eric.ed.gov/fulltext/ED284540.pdf |
05:41
🔗
|
|
wyatt8740 has joined #archiveteam-bs |
05:46
🔗
|
godane |
checking AT&T Tech Channel and their Family Affair video is blocked worldwide https://polsy.org.uk/stuff/ytrestrict.cgi?ytid=H7BiihzcxkQ |
05:47
🔗
|
godane |
by MPI Media |
05:48
🔗
|
Flashfire |
hmmmmmmmmmmmm |
05:48
🔗
|
|
bobmcjr has quit IRC (Read error: Operation timed out) |
06:09
🔗
|
|
Zerote has joined #archiveteam-bs |
06:22
🔗
|
|
c4rc4s has quit IRC (Ping timeout: 246 seconds) |
06:22
🔗
|
|
c4rc4s has joined #archiveteam-bs |
06:30
🔗
|
|
fuzzy8021 has quit IRC (Read error: Connection reset by peer) |
06:31
🔗
|
|
fuzzy8021 has joined #archiveteam-bs |
06:34
🔗
|
|
Coderjo has joined #archiveteam-bs |
06:39
🔗
|
|
wyatt8740 has quit IRC (Read error: Operation timed out) |
06:39
🔗
|
|
enowaldo has joined #archiveteam-bs |
06:52
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
07:16
🔗
|
|
fuzy802 has joined #archiveteam-bs |
07:21
🔗
|
|
Zerote has quit IRC (Ping timeout: 252 seconds) |
07:21
🔗
|
|
fuzzy8021 has quit IRC (Ping timeout: 615 seconds) |
07:26
🔗
|
|
fuzy802 is now known as fuzzy8021 |
07:28
🔗
|
godane |
SketchCow: this may interest you : http://www.queenzone.com/forums/1449503/complete-list-of-documentaries-1979-2018-updated.aspx |
07:28
🔗
|
godane |
tons of queen documentaries |
07:29
🔗
|
godane |
now i found this also : https://purplehippies.com/ |
07:40
🔗
|
|
Zerote has joined #archiveteam-bs |
08:00
🔗
|
|
enowaldo has joined #archiveteam-bs |
08:05
🔗
|
|
enowaldo has quit IRC (Ping timeout: 252 seconds) |
08:19
🔗
|
|
m007a83 has quit IRC (Ping timeout: 252 seconds) |
08:50
🔗
|
|
m007a83 has joined #archiveteam-bs |
08:54
🔗
|
|
BlueMax has quit IRC (Read error: Connection reset by peer) |
09:59
🔗
|
|
w00dsman has quit IRC (Remote host closed the connection) |
10:01
🔗
|
|
enowaldo has joined #archiveteam-bs |
10:15
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
10:34
🔗
|
|
enowaldo has joined #archiveteam-bs |
10:53
🔗
|
|
enowaldo has quit IRC (Read error: Operation timed out) |
11:09
🔗
|
|
wp494 has quit IRC (Ping timeout: 268 seconds) |
11:10
🔗
|
|
wp494 has joined #archiveteam-bs |
11:26
🔗
|
|
enowaldo has joined #archiveteam-bs |
11:29
🔗
|
|
kiskabak has quit IRC (Ping timeout: 265 seconds) |
11:34
🔗
|
|
enowaldo has quit IRC (Ping timeout: 252 seconds) |
11:54
🔗
|
|
enowaldo has joined #archiveteam-bs |
12:43
🔗
|
|
icedice has joined #archiveteam-bs |
13:06
🔗
|
|
terry has joined #archiveteam-bs |
13:08
🔗
|
|
terry is now known as GLaDOS |
13:57
🔗
|
|
deevious has quit IRC (Quit: deevious) |
14:02
🔗
|
|
deevious has joined #archiveteam-bs |
15:05
🔗
|
|
Zerote has quit IRC (Ping timeout: 252 seconds) |
15:25
🔗
|
|
Zerote has joined #archiveteam-bs |
15:59
🔗
|
|
w00dsman has joined #archiveteam-bs |
16:12
🔗
|
|
icedice has quit IRC (Ping timeout: 252 seconds) |
16:15
🔗
|
|
w00dsman has quit IRC (Read error: Operation timed out) |
16:30
🔗
|
|
w00dsman has joined #archiveteam-bs |
16:42
🔗
|
|
anarcat has joined #archiveteam-bs |
16:42
🔗
|
anarcat |
hello window 51 |
16:43
🔗
|
anarcat |
i'll have an estimate of the dataset size of cdn.media.ccc.de in ~8h |
16:46
🔗
|
JAA |
Sweet |
16:47
🔗
|
|
icedice has joined #archiveteam-bs |
16:52
🔗
|
|
Dj-Wawa has quit IRC (Quit: Connection closed for inactivity) |
17:01
🔗
|
Igloo |
anarcat: if my maths are right, ~2 hours. |
17:02
🔗
|
|
astrid has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
Somebody2 has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
MrRadar_ has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
swebb has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
yipdw has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
me has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
phirephl- has quit IRC (Read error: Operation timed out) |
17:04
🔗
|
|
jrwr has quit IRC (Read error: Operation timed out) |
17:05
🔗
|
|
superkuh has quit IRC (Read error: Operation timed out) |
17:05
🔗
|
|
erin has quit IRC (Write error: Broken pipe) |
17:05
🔗
|
|
balrog_ has joined #archiveteam-bs |
17:05
🔗
|
|
chazchaz_ has quit IRC (Read error: Operation timed out) |
17:06
🔗
|
|
RichardG has quit IRC (Ping timeout: 360 seconds) |
17:06
🔗
|
|
RichardG has joined #archiveteam-bs |
17:06
🔗
|
|
swebb has joined #archiveteam-bs |
17:06
🔗
|
|
zino has quit IRC (Ping timeout: 360 seconds) |
17:07
🔗
|
|
Pixi` has joined #archiveteam-bs |
17:07
🔗
|
|
Pixi has quit IRC (Read error: Operation timed out) |
17:07
🔗
|
|
Darkstar has quit IRC (Read error: Operation timed out) |
17:07
🔗
|
|
chfoo has quit IRC (Ping timeout: 360 seconds) |
17:08
🔗
|
|
nightpool has quit IRC (Ping timeout: 360 seconds) |
17:08
🔗
|
|
Fionera_ has joined #archiveteam-bs |
17:09
🔗
|
|
unlobito has quit IRC (Read error: Operation timed out) |
17:09
🔗
|
|
phirephly has joined #archiveteam-bs |
17:09
🔗
|
|
superkuh has joined #archiveteam-bs |
17:09
🔗
|
|
unlobito has joined #archiveteam-bs |
17:09
🔗
|
|
godane has quit IRC (Ping timeout: 360 seconds) |
17:09
🔗
|
|
twigfoot has quit IRC (Ping timeout: 360 seconds) |
17:09
🔗
|
|
Darkstar has joined #archiveteam-bs |
17:10
🔗
|
Fusl |
anarcat: cdn.media.ccc.de for? |
17:10
🔗
|
|
GLaDOS has quit IRC (Read error: Operation timed out) |
17:10
🔗
|
Fusl |
my mirror password might still work |
17:10
🔗
|
Fusl |
can just rsync everything off of it |
17:11
🔗
|
Fusl |
i think around 1.5tb is what it was last time i had my mirror up |
17:11
🔗
|
|
chfoo has joined #archiveteam-bs |
17:12
🔗
|
Fusl |
yup, my password is still active |
17:13
🔗
|
|
schbirid has quit IRC (Read error: Operation timed out) |
17:13
🔗
|
|
nightpool has joined #archiveteam-bs |
17:14
🔗
|
|
balrog has quit IRC (Read error: Operation timed out) |
17:14
🔗
|
|
balrog_ is now known as balrog |
17:17
🔗
|
|
chirlu` has quit IRC (Read error: Operation timed out) |
17:17
🔗
|
|
Fionera has quit IRC (Read error: Operation timed out) |
17:17
🔗
|
|
twigfoot has joined #archiveteam-bs |
17:20
🔗
|
|
zino has joined #archiveteam-bs |
17:21
🔗
|
|
godane has joined #archiveteam-bs |
17:21
🔗
|
Igloo |
If that's all it is we can just chop it up into a bunch of -ao jobs |
17:21
🔗
|
Igloo |
and fire them at AB |
17:22
🔗
|
JAA |
Please no. |
17:22
🔗
|
Igloo |
https://pastebin.com/UGyKqJdR |
17:22
🔗
|
Igloo |
Anyone seen this kind of SSL error before? |
17:23
🔗
|
Igloo |
WARNING ImportError: /tmp/_MEIK5xSzV/libssl.so.1.0.0: version `OPENSSL_1.0.2' not found (required by /usr/lib/python3.5/lib-dynload/_ssl.cpython-35m-x86_64-linux-gnu.so) |
17:23
🔗
|
|
schbirid has joined #archiveteam-bs |
17:23
🔗
|
Igloo |
Ubuntu 16.04 - wpull 2.0.1 & 1.2.3 - youtube-dl is what throws it |
17:23
🔗
|
JAA |
"Your paste has triggered our automatic SPAM detection filter." |
17:23
🔗
|
Igloo |
Fixed. |
17:23
🔗
|
|
bobmcjr has joined #archiveteam-bs |
17:25
🔗
|
JAA |
Fusl: So the thing is, an rsync mirror or similar would definitely be great if we want it as IA items. For the WBM though, we need to retrieve it over HTTP. |
17:25
🔗
|
JAA |
So yeah, the question is where we want to put it and whether we want links in the WBM to work. |
17:25
🔗
|
Fusl |
total size is 8,453,818,507,398 |
17:25
🔗
|
Fusl |
8.5tb |
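[Editor's note: the "8.5tb" figure is the byte count above expressed in decimal terabytes. A minimal sketch of the conversion follows; the function names are illustrative and not from the log. The same count is noticeably smaller when expressed in binary tebibytes, which is worth keeping in mind when comparing against tools that report TiB.]

```python
# Byte count reported in the channel for the rsync mirror.
TOTAL_BYTES = 8_453_818_507_398


def to_decimal_tb(n_bytes: int) -> float:
    """Convert bytes to decimal terabytes (1 TB = 10**12 bytes)."""
    return n_bytes / 10**12


def to_binary_tib(n_bytes: int) -> float:
    """Convert bytes to binary tebibytes (1 TiB = 2**40 bytes)."""
    return n_bytes / 2**40


if __name__ == "__main__":
    print(f"{to_decimal_tb(TOTAL_BYTES):.2f} TB")
    print(f"{to_binary_tib(TOTAL_BYTES):.2f} TiB")
```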
17:25
🔗
|
JAA |
cdn.media.ccc.de URLs came up repeatedly in AB jobs, so there's clearly a good number of links out there. |
17:26
🔗
|
JAA |
anarcat: ^ I guess you can stop your script. |
17:26
🔗
|
Fusl |
also, cdn urls can't really be grabbed |
17:26
🔗
|
Fusl |
the cdn itself is a redirector to other domains |
17:27
🔗
|
Fusl |
https://cdn.media.ccc.de/congress/2016/webm-hd/33c3-8429-eng-deu-fra-33C3_Opening_Ceremony_webm-hd.webm?mirrorlist |
17:27
🔗
|
Fusl |
check this |
17:27
🔗
|
|
chirlu has joined #archiveteam-bs |
17:27
🔗
|
Fusl |
it does a 302 redirect: Location: https://ftp.halifax.rwth-aachen.de/ccc/congress/2016/webm-hd/33c3-8429-eng-deu-fra-33C3_Opening_Ceremony_webm-hd.webm |
17:27
🔗
|
JAA |
That's fine. |
17:27
🔗
|
Fusl |
ic |
17:27
🔗
|
JAA |
We'd just grab both the redirect and whatever it points to. |
17:28
🔗
|
JAA |
Where that's stored exactly doesn't matter for the WBM. |
17:28
🔗
|
JAA |
The CDN link would still work. |
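[Editor's note: the idea described here, recording the redirect response as well as the response it points to, can be sketched as below. This is an illustration only, not how ArchiveBot or wpull implement it. The `fetch` callable is a hypothetical stand-in for an HTTP client that does not follow redirects on its own, and the shortened example URLs are made up.]

```python
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin

# Hypothetical fetcher: takes a URL, returns (status_code, Location header or None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_and_record(url: str, fetch: Fetcher, max_hops: int = 10) -> List[Tuple[str, int]]:
    """Fetch url, keep a record of every hop, and follow redirects up to max_hops."""
    records: List[Tuple[str, int]] = []
    for _ in range(max_hops):
        status, location = fetch(url)
        records.append((url, status))  # the redirect itself is recorded too
        if status in REDIRECT_CODES and location:
            # Location may be relative; resolve it against the current URL.
            url = urljoin(url, location)
            continue
        break
    return records


if __name__ == "__main__":
    # Canned responses standing in for the CDN and one of its mirrors.
    fake = {
        "https://cdn.media.ccc.de/example.webm":
            (302, "https://ftp.halifax.rwth-aachen.de/ccc/example.webm"),
        "https://ftp.halifax.rwth-aachen.de/ccc/example.webm": (200, None),
    }
    for hop in follow_and_record("https://cdn.media.ccc.de/example.webm", lambda u: fake[u]):
        print(hop)
```

Because both hops are recorded, a replay of the CDN URL can serve the stored redirect and then the stored mirror response, regardless of which mirror it was.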
17:28
🔗
|
Fusl |
well here's the full rsync file list: http://xor.meo.ws/TRBVQ6SkRNtSnqDJC6nyKeOrr6Uo1bAP/ccc.txt |
17:29
🔗
|
Fusl |
and here: https://cdn.media.ccc.de/INDEX |
17:35
🔗
|
|
erin has joined #archiveteam-bs |
17:36
🔗
|
|
me has joined #archiveteam-bs |
17:37
🔗
|
|
jrwr has joined #archiveteam-bs |
17:37
🔗
|
|
Fusl sets mode: +o jrwr |
17:39
🔗
|
|
astrid has joined #archiveteam-bs |
17:39
🔗
|
|
Fusl sets mode: +o astrid |
17:40
🔗
|
|
MrRadar has joined #archiveteam-bs |
17:41
🔗
|
|
chazchaz has joined #archiveteam-bs |
17:41
🔗
|
|
Somebody2 has joined #archiveteam-bs |
17:41
🔗
|
|
svchfoo1 sets mode: +o Somebody2 |
17:41
🔗
|
|
svchfoo3 sets mode: +o Somebody2 |
17:46
🔗
|
SketchCow |
Hi Jason. I found out today that you blocked my bot (@shwayest) which allows people to tweet anonymously from IRC. I am not going to try and convince you to unblock it or anything like that as I respect your decision, however I'm just curious as to what caused you to block it? I am busy adding more features and like to gather data so I can improve existing ones, such as the anti-abuse stuff. |
17:46
🔗
|
SketchCow |
The fuck is this |
17:47
🔗
|
JAA |
https://twitter.com/shwayest ? |
17:47
🔗
|
Fusl |
ayeah |
17:48
🔗
|
JAA |
The tweets there make me want to block it as well, and I don't even use Twitter. |
17:48
🔗
|
godane |
is IA choking on its search index or something |
17:48
🔗
|
Fusl |
"what caused you to block it" - because what you're doing is a bad idea? |
17:48
🔗
|
godane |
search is not working and vhsvault is empty |
17:49
🔗
|
Fusl |
godane: archive seems b0rked right now, website was fully down a few minutes ago |
17:49
🔗
|
JAA |
https://twitter.com/search?q=from%3Ashwayest%20to%3Atextfiles&src=typd |
17:49
🔗
|
SketchCow |
OK, I see now |
17:49
🔗
|
godane |
ok |
17:50
🔗
|
SketchCow |
When he said "from IRC" I assumed he meant from here |
17:50
🔗
|
SketchCow |
But he means probably some ridiculous channel somewhere |
17:50
🔗
|
JAA |
Yeah |
17:50
🔗
|
SketchCow |
And he was tweeting at me to tell me about ASSembler |
17:50
🔗
|
SketchCow |
And that's how he found out I was blocked |
17:50
🔗
|
JAA |
http://www.megachan.net/proxy-tweets/ |
17:50
🔗
|
|
yipdw has joined #archiveteam-bs |
17:51
🔗
|
Fusl |
"I fully expect the account to be banned soon due to shitposts (shitweets?) from the IRC users" |
17:51
🔗
|
SketchCow |
I like that he both acknowledges that it can't ever not be a vector for abuse, but also is saaaaaaaaaaaaaaad I blocked that shit |
17:51
🔗
|
Fusl |
you are expecting the worst, but you are complaining about the best outcome you had so far? |
17:51
🔗
|
Fusl |
SketchCow: yeah |
17:51
🔗
|
Fusl |
:D |
17:51
🔗
|
SketchCow |
Anyway, sorry to distract, where am I |
17:51
🔗
|
Fusl |
literally just my words |
17:51
🔗
|
Fusl |
hahaha |
17:57
🔗
|
godane |
looks like FOS may be down too |
17:58
🔗
|
kiska |
Yep, ok, so it wasn't just my vm failing to connect to rsync then |
18:00
🔗
|
Fusl |
Error: Error: connect EHOSTUNREACH 208.70.31.102:21 |
18:00
🔗
|
Fusl |
Thu May 30 2019 19:43:08 GMT+0200 (Central European Summer Time) |
18:00
🔗
|
Fusl |
yeah |
18:19
🔗
|
SketchCow |
The entire datacenter is down. |
18:19
🔗
|
SketchCow |
Fiber upgrade |
18:20
🔗
|
SketchCow |
Was supposed to be an hour, but it's expanding, of course |
18:20
🔗
|
Fusl |
lol nice |
18:20
🔗
|
Fusl |
feels more like a downgrade if you ask me :P |
18:37
🔗
|
|
GLaDOS has joined #archiveteam-bs |
18:46
🔗
|
SketchCow |
FOS is back. |
19:13
🔗
|
|
killsushi has joined #archiveteam-bs |
19:22
🔗
|
|
Despatche has quit IRC (Read error: Operation timed out) |
19:37
🔗
|
|
icedice2 has joined #archiveteam-bs |
19:40
🔗
|
|
icedice has quit IRC (Ping timeout: 252 seconds) |
19:47
🔗
|
|
icedice2 has quit IRC (Quit: Leaving) |
19:48
🔗
|
|
icedice has joined #archiveteam-bs |
20:10
🔗
|
|
Despatche has joined #archiveteam-bs |
20:25
🔗
|
godane |
SketchCow: i'm starting to upload some vhs tape rips i did 2 weeks ago |
20:26
🔗
|
godane |
these are Reader's Digest tapes on Grand Canyon, Yellowstone, and Yosemite from 1988 |
20:30
🔗
|
|
thuban4 has joined #archiveteam-bs |
20:32
🔗
|
|
w00dsman has quit IRC (Leaving) |
20:32
🔗
|
|
enowaldo has quit IRC (Ping timeout: 268 seconds) |
20:51
🔗
|
|
thuban has joined #archiveteam-bs |
20:52
🔗
|
|
thuban4 has quit IRC (Read error: Operation timed out) |
21:03
🔗
|
|
lindalap has joined #archiveteam-bs |
21:10
🔗
|
|
lindalap has quit IRC (Quit: lindalap) |
21:43
🔗
|
|
enowaldo has joined #archiveteam-bs |
21:45
🔗
|
|
BlueMax has joined #archiveteam-bs |
21:48
🔗
|
|
enowaldo has quit IRC (Ping timeout: 268 seconds) |
22:09
🔗
|
icedice |
godane: Is there any recommended VHS ripping kit btw or do they all produce about the same quality? |
22:10
🔗
|
godane |
i'm using a usb easycap |
22:11
🔗
|
godane |
it was my cheap solution to capture my home recordings |
22:11
🔗
|
godane |
then captures everything Jason sends me |
22:23
🔗
|
|
DigiDigi has joined #archiveteam-bs |
22:34
🔗
|
icedice |
Ok |
22:37
🔗
|
|
phiresky has quit IRC (Quit: The Lounge - https://thelounge.chat) |
22:38
🔗
|
|
Atom has joined #archiveteam-bs |
22:38
🔗
|
|
BlueMax has quit IRC (Quit: Leaving) |
23:03
🔗
|
|
Zerote has quit IRC (Ping timeout: 252 seconds) |
23:10
🔗
|
|
Hani has quit IRC (Ping timeout: 615 seconds) |
23:10
🔗
|
|
Hani has joined #archiveteam-bs |
23:27
🔗
|
|
exoire has joined #archiveteam-bs |
23:31
🔗
|
anarcat |
JAA: thanks for the heads up, stopped |
23:31
🔗
|
anarcat |
hum |
23:32
🔗
|
anarcat |
it was finished, oddly |
23:32
🔗
|
anarcat |
$ wc -l all-lengths |
23:32
🔗
|
anarcat |
6727 all-lengths |
23:32
🔗
|
anarcat |
$ awk '{ total += $2 } END { print total }' < all-lengths |
23:32
🔗
|
anarcat |
918453455252 |
23:32
🔗
|
JAA |
Maybe the server doesn't advertise the length for some (most?) URLs? |
23:32
🔗
|
anarcat |
that says 918GB |
23:32
🔗
|
anarcat |
but i would trust the other numbers we had here before more |
23:33
🔗
|
anarcat |
anyways |
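[Editor's note: the awk one-liner above silently counts a missing or non-numeric second field as zero, which would hide the case JAA suggests, where the server does not advertise a length. A minimal sketch that sums the same column and also counts such lines is shown below. It assumes, as the `$2` in the awk command implies, that each line of `all-lengths` has whitespace-separated fields with the length in the second one; the actual file format is not shown in the log.]

```python
from typing import Iterable, Tuple


def sum_lengths(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (total_bytes, lines_without_length) for 'name length' lines."""
    total = 0
    missing = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines entirely
        if len(fields) >= 2 and fields[1].isdigit():
            total += int(fields[1])
        else:
            missing += 1  # no usable length in the second field
    return total, missing
```

To run it on the real data, open the file and pass the file object, for example `sum_lengths(open("all-lengths"))`. A large `missing` count would support the explanation that the 918 GB total undercounts the mirror.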
23:52
🔗
|
|
godane has quit IRC (Ping timeout: 246 seconds) |