#archiveteam-bs 2019-05-30,Thu


Time Nickname Message
00:00 🔗 odemg has quit IRC (Ping timeout: 265 seconds)
00:03 🔗 odemg has joined #archiveteam-bs
00:19 🔗 enowaldo_ has quit IRC (Read error: Operation timed out)
00:52 🔗 achip has quit IRC (Ping timeout: 255 seconds)
00:57 🔗 achip has joined #archiveteam-bs
00:58 🔗 w00dsman has quit IRC (Leaving)
01:34 🔗 HashbangI has quit IRC (Remote host closed the connection)
01:35 🔗 enowaldo has joined #archiveteam-bs
01:43 🔗 enowaldo has quit IRC (Ping timeout: 492 seconds)
01:47 🔗 jeekl has joined #archiveteam-bs
01:49 🔗 zino has joined #archiveteam-bs
01:49 🔗 HashbangI has joined #archiveteam-bs
02:08 🔗 odemg has quit IRC (Ping timeout: 265 seconds)
02:08 🔗 xeam has joined #archiveteam-bs
02:09 🔗 odemg has joined #archiveteam-bs
02:12 🔗 xeam has left
03:10 🔗 w00dsman has joined #archiveteam-bs
03:17 🔗 qw3rty112 has joined #archiveteam-bs
03:22 🔗 qw3rty111 has quit IRC (Read error: Operation timed out)
03:41 🔗 odemgi_ has joined #archiveteam-bs
03:43 🔗 odemgi has quit IRC (Read error: Operation timed out)
03:43 🔗 odemg has quit IRC (Ping timeout: 265 seconds)
03:43 🔗 enowaldo has joined #archiveteam-bs
03:44 🔗 bobmcjr has joined #archiveteam-bs
03:46 🔗 bobmcjr This is probably worth scraping given this notice: https://assemblergames.com/threads/this-forum-to-close-in-30-days.71032/
03:48 🔗 JAA Already in progress.
03:51 🔗 enowaldo has quit IRC (Read error: Operation timed out)
03:52 🔗 bobmcjr Cool
03:55 🔗 odemg has joined #archiveteam-bs
04:27 🔗 bobmcjr What am I supposed to do with warcs again? I have a currently dead forum I scraped a few months ago (minus stylesheets and a few icons, sorry).
04:28 🔗 SketchCow Upload them to archive.org
04:29 🔗 SketchCow They'll go into the warczone collection
04:29 🔗 bobmcjr Alright. Webrecorder Player can't see any URLs in this warc for whatever reason. The content is there, and the format appears fine with a quick look in vim.
04:37 🔗 godane SketchCow: so finally something interesting in my search for japanese magazines
04:37 🔗 godane i found scans someone had put up on mega.nz
04:37 🔗 godane so i'm grabbing that
04:38 🔗 godane it's about 10gb+ from what i can tell
04:38 🔗 enowaldo has joined #archiveteam-bs
04:46 🔗 Coderjo has quit IRC (Quit: new kernel)
04:47 🔗 enowaldo has quit IRC (Ping timeout: 492 seconds)
04:49 🔗 godane SketchCow: so i found out Kevin Savetz uploaded 3 ERIC items
04:51 🔗 godane i'm going to have to go after that id range again cause i have not touched it since christmas 2014:
04:51 🔗 godane https://archive.org/details/ERIC_ED284545
04:51 🔗 godane one of savetz files : https://archive.org/details/ERIC_ED284540
05:13 🔗 godane ok then looks like savetz got a copy of that id from somewhere else
05:14 🔗 godane cause ED284540 doesn't have a url on page and this url is 404:
05:14 🔗 godane https://files.eric.ed.gov/fulltext/ED284540.pdf
05:41 🔗 wyatt8740 has joined #archiveteam-bs
05:46 🔗 godane checking AT&T Tech Channel and their Family Affair video is blocked worldwide https://polsy.org.uk/stuff/ytrestrict.cgi?ytid=H7BiihzcxkQ
05:47 🔗 godane by MPI Media
05:48 🔗 Flashfire hmmmmmmmmmmmm
05:48 🔗 bobmcjr has quit IRC (Read error: Operation timed out)
06:09 🔗 Zerote has joined #archiveteam-bs
06:22 🔗 c4rc4s has quit IRC (Ping timeout: 246 seconds)
06:22 🔗 c4rc4s has joined #archiveteam-bs
06:30 🔗 fuzzy8021 has quit IRC (Read error: Connection reset by peer)
06:31 🔗 fuzzy8021 has joined #archiveteam-bs
06:34 🔗 Coderjo has joined #archiveteam-bs
06:39 🔗 wyatt8740 has quit IRC (Read error: Operation timed out)
06:39 🔗 enowaldo has joined #archiveteam-bs
06:52 🔗 enowaldo has quit IRC (Read error: Operation timed out)
07:16 🔗 fuzy802 has joined #archiveteam-bs
07:21 🔗 Zerote has quit IRC (Ping timeout: 252 seconds)
07:21 🔗 fuzzy8021 has quit IRC (Ping timeout: 615 seconds)
07:26 🔗 fuzy802 is now known as fuzzy8021
07:28 🔗 godane SketchCow: this may interest you : http://www.queenzone.com/forums/1449503/complete-list-of-documentaries-1979-2018-updated.aspx
07:28 🔗 godane tons of queen documentaries
07:29 🔗 godane now i found this also : https://purplehippies.com/
07:40 🔗 Zerote has joined #archiveteam-bs
08:00 🔗 enowaldo has joined #archiveteam-bs
08:05 🔗 enowaldo has quit IRC (Ping timeout: 252 seconds)
08:19 🔗 m007a83 has quit IRC (Ping timeout: 252 seconds)
08:50 🔗 m007a83 has joined #archiveteam-bs
08:54 🔗 BlueMax has quit IRC (Read error: Connection reset by peer)
09:59 🔗 w00dsman has quit IRC (Remote host closed the connection)
10:01 🔗 enowaldo has joined #archiveteam-bs
10:15 🔗 enowaldo has quit IRC (Read error: Operation timed out)
10:34 🔗 enowaldo has joined #archiveteam-bs
10:53 🔗 enowaldo has quit IRC (Read error: Operation timed out)
11:09 🔗 wp494 has quit IRC (Ping timeout: 268 seconds)
11:10 🔗 wp494 has joined #archiveteam-bs
11:26 🔗 enowaldo has joined #archiveteam-bs
11:29 🔗 kiskabak has quit IRC (Ping timeout: 265 seconds)
11:34 🔗 enowaldo has quit IRC (Ping timeout: 252 seconds)
11:54 🔗 enowaldo has joined #archiveteam-bs
12:43 🔗 icedice has joined #archiveteam-bs
13:06 🔗 terry has joined #archiveteam-bs
13:08 🔗 terry is now known as GLaDOS
13:57 🔗 deevious has quit IRC (Quit: deevious)
14:02 🔗 deevious has joined #archiveteam-bs
15:05 🔗 Zerote has quit IRC (Ping timeout: 252 seconds)
15:25 🔗 Zerote has joined #archiveteam-bs
15:59 🔗 w00dsman has joined #archiveteam-bs
16:12 🔗 icedice has quit IRC (Ping timeout: 252 seconds)
16:15 🔗 w00dsman has quit IRC (Read error: Operation timed out)
16:30 🔗 w00dsman has joined #archiveteam-bs
16:42 🔗 anarcat has joined #archiveteam-bs
16:42 🔗 anarcat hello window 51
16:43 🔗 anarcat i'll have an estimate of the dataset size of cdn.media.ccc.de in ~8h
16:46 🔗 JAA Sweet
16:47 🔗 icedice has joined #archiveteam-bs
16:52 🔗 Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
17:01 🔗 Igloo anarcat: if my maths are right, ~2 hours.
17:02 🔗 astrid has quit IRC (Read error: Operation timed out)
17:04 🔗 Somebody2 has quit IRC (Read error: Operation timed out)
17:04 🔗 MrRadar_ has quit IRC (Read error: Operation timed out)
17:04 🔗 swebb has quit IRC (Read error: Operation timed out)
17:04 🔗 yipdw has quit IRC (Read error: Operation timed out)
17:04 🔗 me has quit IRC (Read error: Operation timed out)
17:04 🔗 phirephl- has quit IRC (Read error: Operation timed out)
17:04 🔗 jrwr has quit IRC (Read error: Operation timed out)
17:05 🔗 superkuh has quit IRC (Read error: Operation timed out)
17:05 🔗 erin has quit IRC (Write error: Broken pipe)
17:05 🔗 balrog_ has joined #archiveteam-bs
17:05 🔗 chazchaz_ has quit IRC (Read error: Operation timed out)
17:06 🔗 RichardG has quit IRC (Ping timeout: 360 seconds)
17:06 🔗 RichardG has joined #archiveteam-bs
17:06 🔗 swebb has joined #archiveteam-bs
17:06 🔗 zino has quit IRC (Ping timeout: 360 seconds)
17:07 🔗 Pixi` has joined #archiveteam-bs
17:07 🔗 Pixi has quit IRC (Read error: Operation timed out)
17:07 🔗 Darkstar has quit IRC (Read error: Operation timed out)
17:07 🔗 chfoo has quit IRC (Ping timeout: 360 seconds)
17:08 🔗 nightpool has quit IRC (Ping timeout: 360 seconds)
17:08 🔗 Fionera_ has joined #archiveteam-bs
17:09 🔗 unlobito has quit IRC (Read error: Operation timed out)
17:09 🔗 phirephly has joined #archiveteam-bs
17:09 🔗 superkuh has joined #archiveteam-bs
17:09 🔗 unlobito has joined #archiveteam-bs
17:09 🔗 godane has quit IRC (Ping timeout: 360 seconds)
17:09 🔗 twigfoot has quit IRC (Ping timeout: 360 seconds)
17:09 🔗 Darkstar has joined #archiveteam-bs
17:10 🔗 Fusl anarcat: cdn.media.ccc.de for?
17:10 🔗 GLaDOS has quit IRC (Read error: Operation timed out)
17:10 🔗 Fusl my mirror password might still work
17:10 🔗 Fusl can just rsync everything off of it
17:11 🔗 Fusl i think around 1.5tb is what it was last time i had my mirror up
17:11 🔗 chfoo has joined #archiveteam-bs
17:12 🔗 Fusl yup, my password is still active
17:13 🔗 schbirid has quit IRC (Read error: Operation timed out)
17:13 🔗 nightpool has joined #archiveteam-bs
17:14 🔗 balrog has quit IRC (Read error: Operation timed out)
17:14 🔗 balrog_ is now known as balrog
17:17 🔗 chirlu` has quit IRC (Read error: Operation timed out)
17:17 🔗 Fionera has quit IRC (Read error: Operation timed out)
17:17 🔗 twigfoot has joined #archiveteam-bs
17:20 🔗 zino has joined #archiveteam-bs
17:21 🔗 godane has joined #archiveteam-bs
17:21 🔗 Igloo If that's all it is we can just chop it up into a bunch of -ao jobs
17:21 🔗 Igloo and fire them at AB
17:22 🔗 JAA Please no.
17:22 🔗 Igloo https://pastebin.com/UGyKqJdR
17:22 🔗 Igloo Anyone seen this kind of SSL error before?
17:23 🔗 Igloo WARNING ImportError: /tmp/_MEIK5xSzV/libssl.so.1.0.0: version `OPENSSL_1.0.2' not found (required by /usr/lib/python3.5/lib-dynload/_ssl.cpython-35m-x86_64-linux-gnu.so)
17:23 🔗 schbirid has joined #archiveteam-bs
17:23 🔗 Igloo Ubuntu 16.04 - wpull 2.0.1 & 1.2.3 - youtube-dl is what throws it
17:23 🔗 JAA "Your paste has triggered our automatic SPAM detection filter."
17:23 🔗 Igloo Fixed.
17:23 🔗 bobmcjr has joined #archiveteam-bs
17:25 🔗 JAA Fusl: So the thing is, an rsync mirror or similar would definitely be great if we want it as IA items. For the WBM though, we need to retrieve it over HTTP.
17:25 🔗 JAA So yeah, the question is where we want to put it and whether we want links in the WBM to work.
17:25 🔗 Fusl total size is 8,453,818,507,398
17:25 🔗 Fusl 8.5tb
17:25 🔗 JAA cdn.media.ccc.de URLs came up repeatedly in AB jobs, so there's clearly a good number of links out there.
17:26 🔗 JAA anarcat: ^ I guess you can stop your script.
17:26 🔗 Fusl also, cdn urls can't really be grabbed
17:26 🔗 Fusl the cdn itself is a redirector to other domains
17:27 🔗 Fusl https://cdn.media.ccc.de/congress/2016/webm-hd/33c3-8429-eng-deu-fra-33C3_Opening_Ceremony_webm-hd.webm?mirrorlist
17:27 🔗 Fusl check this
17:27 🔗 chirlu has joined #archiveteam-bs
17:27 🔗 Fusl it does a 302 redirect: Location: https://ftp.halifax.rwth-aachen.de/ccc/congress/2016/webm-hd/33c3-8429-eng-deu-fra-33C3_Opening_Ceremony_webm-hd.webm
17:27 🔗 JAA That's fine.
17:27 🔗 Fusl ic
17:27 🔗 JAA We'd just grab both the redirect and whatever it points to.
17:28 🔗 JAA Where that's stored exactly doesn't matter for the WBM.
17:28 🔗 JAA The CDN link would still work.
17:28 🔗 Fusl well here's the full rsync file list: http://xor.meo.ws/TRBVQ6SkRNtSnqDJC6nyKeOrr6Uo1bAP/ccc.txt
17:29 🔗 Fusl and here: https://cdn.media.ccc.de/INDEX
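
A minimal sketch of the approach JAA describes above: fetch the CDN URL, record its redirect, then fetch whatever it points to, so the original CDN link still resolves later. The names `https_head` and `capture_chain` are hypothetical and only illustrate the idea; this is not how ArchiveBot or wpull is actually implemented, and it assumes every hop is served over HTTPS.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def https_head(url):
    """Send one HEAD request without following redirects.

    Returns (status, location); location is None when the response
    carries no Location header. Assumes an https:// URL.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    conn = http.client.HTTPSConnection(parts.netloc, timeout=30)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def capture_chain(url, head=https_head, max_hops=5):
    """Return every URL that must be fetched so the first one resolves.

    `head` is injectable so the logic can be exercised without network
    access; `max_hops` guards against redirect loops.
    """
    chain = [url]
    for _ in range(max_hops):
        status, location = head(chain[-1])
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        chain.append(urljoin(chain[-1], location))
    return chain
```

Calling `capture_chain` on a cdn.media.ccc.de URL would be expected to return the CDN URL followed by the mirror URL from the 302, and both would then be fetched over HTTP.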
17:35 🔗 erin has joined #archiveteam-bs
17:36 🔗 me has joined #archiveteam-bs
17:37 🔗 jrwr has joined #archiveteam-bs
17:37 🔗 Fusl sets mode: +o jrwr
17:39 🔗 astrid has joined #archiveteam-bs
17:39 🔗 Fusl sets mode: +o astrid
17:40 🔗 MrRadar has joined #archiveteam-bs
17:41 🔗 chazchaz has joined #archiveteam-bs
17:41 🔗 Somebody2 has joined #archiveteam-bs
17:41 🔗 svchfoo1 sets mode: +o Somebody2
17:41 🔗 svchfoo3 sets mode: +o Somebody2
17:46 🔗 SketchCow Hi Jason. I found out today that you blocked my bot (@shwayest) which allows people to tweet anonymously from IRC. I am not going to try and convince you to unblock it or anything like that as I respect your decision, however I'm just curious as to what caused you to block it? I am busy adding more features and like to gather data so I can improve existing ones, such as the anti-abuse stuff.
17:46 🔗 SketchCow The fuck is this
17:47 🔗 JAA https://twitter.com/shwayest ?
17:47 🔗 Fusl ayeah
17:48 🔗 JAA The tweets there make me want to block it as well, and I don't even use Twitter.
17:48 🔗 godane is IA choking on its search index or something
17:48 🔗 Fusl "what caused you to block it" - because what you're doing is a bad idea?
17:48 🔗 godane search is not working and vhsvault is empty
17:49 🔗 Fusl godane: archive seems b0rked right now, website was fully down a few minutes ago
17:49 🔗 JAA https://twitter.com/search?q=from%3Ashwayest%20to%3Atextfiles&src=typd
17:49 🔗 SketchCow OK, I see now
17:49 🔗 godane ok
17:50 🔗 SketchCow When he said "from IRC" I assumed he meant from here
17:50 🔗 SketchCow But he means probably some ridiculous channel somewhere
17:50 🔗 JAA Yeah
17:50 🔗 SketchCow And he was tweeting at me to tell me about ASSembler
17:50 🔗 SketchCow And that's how he found out I was blocked
17:50 🔗 JAA http://www.megachan.net/proxy-tweets/
17:50 🔗 yipdw has joined #archiveteam-bs
17:51 🔗 Fusl "I fully expect the account to be banned soon due to shitposts (shitweets?) from the IRC users"
17:51 🔗 SketchCow I like that he both acknowledges that it can't ever not be a vector for abuse, but also is saaaaaaaaaaaaaaad I blocked that shit
17:51 🔗 Fusl you are expecting the worst, but you are complaining about the best outcome you had so far?
17:51 🔗 Fusl SketchCow: yeah
17:51 🔗 Fusl :D
17:51 🔗 SketchCow Anyway, sorry to distract, where am I
17:51 🔗 Fusl literally just my words
17:51 🔗 Fusl hahaha
17:57 🔗 godane looks like FOS may be down too
17:58 🔗 kiska Yep, ok, so it wasn't just my vm failing to connect to rsync then
18:00 🔗 Fusl Error: Error: connect EHOSTUNREACH 208.70.31.102:21
18:00 🔗 Fusl Thu May 30 2019 19:43:08 GMT+0200 (Central European Summer Time)
18:00 🔗 Fusl yeah
18:19 🔗 SketchCow The entire datacenter is down.
18:19 🔗 SketchCow Fiber upgrade
18:20 🔗 SketchCow Was supposed to be an hour, but it's expanding, of course
18:20 🔗 Fusl lol nice
18:20 🔗 Fusl feels more like a downgrade if you ask me :P
18:37 🔗 GLaDOS has joined #archiveteam-bs
18:46 🔗 SketchCow FOS is back.
19:13 🔗 killsushi has joined #archiveteam-bs
19:22 🔗 Despatche has quit IRC (Read error: Operation timed out)
19:37 🔗 icedice2 has joined #archiveteam-bs
19:40 🔗 icedice has quit IRC (Ping timeout: 252 seconds)
19:47 🔗 icedice2 has quit IRC (Quit: Leaving)
19:48 🔗 icedice has joined #archiveteam-bs
20:10 🔗 Despatche has joined #archiveteam-bs
20:25 🔗 godane SketchCow: i'm starting to upload some vhs tape rips i did 2 weeks ago
20:26 🔗 godane these are Reader's Digest tapes on Grand Canyon, Yellowstone, and Yosemite from 1988
20:30 🔗 thuban4 has joined #archiveteam-bs
20:32 🔗 w00dsman has quit IRC (Leaving)
20:32 🔗 enowaldo has quit IRC (Ping timeout: 268 seconds)
20:51 🔗 thuban has joined #archiveteam-bs
20:52 🔗 thuban4 has quit IRC (Read error: Operation timed out)
21:03 🔗 lindalap has joined #archiveteam-bs
21:10 🔗 lindalap has quit IRC (Quit: lindalap)
21:43 🔗 enowaldo has joined #archiveteam-bs
21:45 🔗 BlueMax has joined #archiveteam-bs
21:48 🔗 enowaldo has quit IRC (Ping timeout: 268 seconds)
22:09 🔗 icedice godane: Is there any recommended VHS ripping kit btw or do they all produce about the same quality?
22:10 🔗 godane i'm using a usb easycap
22:11 🔗 godane it was my cheap solution to capture my home recordings
22:11 🔗 godane then it captures everything Jason sends me
22:23 🔗 DigiDigi has joined #archiveteam-bs
22:34 🔗 icedice Ok
22:37 🔗 phiresky has quit IRC (Quit: The Lounge - https://thelounge.chat)
22:38 🔗 Atom has joined #archiveteam-bs
22:38 🔗 BlueMax has quit IRC (Quit: Leaving)
23:03 🔗 Zerote has quit IRC (Ping timeout: 252 seconds)
23:10 🔗 Hani has quit IRC (Ping timeout: 615 seconds)
23:10 🔗 Hani has joined #archiveteam-bs
23:27 🔗 exoire has joined #archiveteam-bs
23:31 🔗 anarcat JAA: thanks for the heads up, stopped
23:31 🔗 anarcat hum
23:32 🔗 anarcat it was finished, oddly
23:32 🔗 anarcat $ wc -l all-lengths
23:32 🔗 anarcat 6727 all-lengths
23:32 🔗 anarcat $ awk '{ total += $2 } END { print total }' < all-lengths
23:32 🔗 anarcat 918453455252
23:32 🔗 JAA Maybe the server doesn't advertise the length for some (most?) URLs?
23:32 🔗 anarcat that says 918GB
23:32 🔗 anarcat but i would trust the other numbers we had before here more
23:33 🔗 anarcat anyways
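
A rough Python equivalent of anarcat's awk one-liner, as a sketch only. It assumes `all-lengths` holds one whitespace-separated record per line with the byte length in the second field, which is what the awk command implies. Unlike awk, it also counts lines with no usable length, which is one way to check JAA's guess that the server does not advertise a length for some URLs. The function names `summarize_lengths` and `human` are hypothetical.

```python
def summarize_lengths(lines):
    """Sum the second whitespace-separated field of each line.

    Returns (total_bytes, counted, missing), where `missing` is the
    number of lines whose second field is absent or not a plain integer.
    """
    total = counted = missing = 0
    for line in lines:
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            total += int(fields[1])
            counted += 1
        else:
            missing += 1
    return total, counted, missing


def human(nbytes):
    """Format a byte count in decimal (GB) and binary (GiB) units."""
    return f"{nbytes / 1e9:.1f} GB ({nbytes / 2**30:.1f} GiB)"
```

To use it, open `all-lengths`, pass the file object to `summarize_lengths`, and print `human(total)` together with the `missing` count. A large `missing` count would help explain why this total is so far below the 8,453,818,507,398 bytes Fusl got from the rsync listing.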
23:52 🔗 godane has quit IRC (Ping timeout: 246 seconds)
