#archiveteam 2014-04-22,Tue


Time Nickname Message
02:49 🔗 Stiletto never mind about my ftp.bocaresearch.com request, glitch tells me SketchCow already snagged a copy a while ago
02:49 🔗 Stiletto (apologies if I missed the chatlog)
03:00 🔗 Stiletto (though if he has it, it doesn't seem to be in the FTP Site Boneyard)
10:51 🔗 ats another collection of magazine scans: http://www.americanradiohistory.com/
13:14 🔗 SketchCow If someone wants to grab them and upload them via FTP, I can add them.
13:27 🔗 SmileyG SketchCow: just the pdf's right?
13:28 🔗 SketchCow Are there other things? But yeah, likely just the PDFs.
13:28 🔗 SketchCow In JSMESS news, I've been doing a top-down compiling of every machine that JSMESS supports, in stages anyway, and all the machine types I think we're best at.
13:28 🔗 SmileyG SketchCow: I've not found anything yet, just checking I didn't need a warc grab too.
13:29 🔗 SketchCow Oh yeah, no WARC needed.
13:29 🔗 SmileyG I'm going to give it a try, but anyone else can feel free to tell me they've done it.
13:29 🔗 SmileyG https://stackoverflow.com/questions/19883073/download-all-pdf-files-using-wget << seems nice way?
13:34 🔗 SmileyG K, it's going
13:46 🔗 SadDM That's going to be a monster. There are thousands of files there. Good on you for taking it on.
13:47 🔗 SadDM SketchCow: Did anything ever come of the site from a while back that had the Sears/JCPenney/other catalogues? I seem to recall that you contacted the owner.
13:47 🔗 SmileyG well that didn't work well D:
13:47 🔗 * SmileyG tries something else
13:49 🔗 SmileyG k a wget with -A pdf is working better.
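A note on the approach above: wget's -A option takes an accept list of filename suffixes or patterns, so -A pdf keeps only files ending in "pdf" during a recursive crawl. The same filtering idea can be sketched in Python. This is an illustration only, not the command SmileyG ran; the names `pdf_links` and `_LinkCollector` are hypothetical, and matching is case-insensitive here as a choice made for the sketch.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pdf_links(html, base_url):
    """Return absolute URLs of links whose path ends in .pdf."""
    collector = _LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.hrefs:
        # Resolve relative links against the page they came from.
        absolute = urljoin(base_url, href)
        # Compare the path only, so query strings do not hide the suffix.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            found.append(absolute)
    return found
```

Feeding each fetched index page through `pdf_links` yields the list of files to download; everything else on the page is skipped.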
13:57 🔗 godane so the Chinese are helping me save news videos from NBC and ABC
13:57 🔗 godane that just makes me sad on the inside
13:59 🔗 SmileyG lol
13:59 🔗 SketchCow I talked with the guy who did the catalogs.
13:59 🔗 SketchCow He doesn't want to go on archive.org yet, has some dream or other.
13:59 🔗 SketchCow But he might in the future.
14:00 🔗 SketchCow I owe him a mail back, actually. He wanted to know what my site has "planned" for his material.
14:00 🔗 SketchCow Uh, shove it into a collection and never think of it again?
14:00 🔗 SketchCow Just working on how to phrase that hotness.
14:01 🔗 SmileyG :D
14:03 🔗 balrog SketchCow: take it and dark it for now?
14:06 🔗 SketchCow Feh, I'll work it out later.
14:07 🔗 SketchCow I'm hanging out in NYC today. Trying to get a few things done on the onlines.
14:07 🔗 SketchCow JSMESS repair is on the list, almost have my shit together.
14:08 🔗 midas hm, next thing: archiving archive.org for the list? :p
14:10 🔗 SketchCow Personally, I would love if a wiki page on archiveteam.org discussed all the ways known to export data out of archive.org.
14:10 🔗 SketchCow archiveteam.org can probably use a cleaning generally, actually.
14:11 🔗 SketchCow I'm glad we cut back on the spam.
14:14 🔗 Nemo_bis Like "Sneak into the IA datacentre and load all hard disks on a truck"
14:15 🔗 Nemo_bis Yes, QuestyCaptcha is nice
14:24 🔗 midas s/truck/trucks probably Nemo_bis :p
14:24 🔗 midas and don't forget, these disks are full so they are heavy ;-)
14:25 🔗 Nemo_bis hehe
14:28 🔗 Nemo_bis Tapes are still ridiculously cheap and the Internet2 link seems way far from being at capacity
14:29 🔗 SmileyG let's start fundraising? :D
14:29 🔗 Nemo_bis Probably some USA (or even American in general) university lab, with the help of some cheap/free student labour, could easily download all IA data
14:29 🔗 godane you guys also forgot about my 1PB dvd plan
14:29 🔗 Nemo_bis yes godane, but imagine if your cat scratches it
14:30 🔗 Nemo_bis "my cat just deleted the whole german literature"
14:30 🔗 godane that's why you make 50 copies
14:30 🔗 godane also i don't allow my cats in my room
14:31 🔗 Nemo_bis cats don't obey humans, but the opposite; it's the first law of catness
14:31 🔗 SmileyG Nemo_bis: I'd like to get a copy out of the US tbh D:
14:31 🔗 SmileyG but, internet2 over here? not sure it exists...
14:32 🔗 SmileyG Anyway to #archiveteam-bs !
14:32 🔗 Nemo_bis In theory any Geant university would do
19:12 🔗 Smiley damnit, wget locked up my system I think for that earlier grab, trying again now
19:27 🔗 midas https://en.wikipedia.org/wiki/Holographic_Versatile_Disc
19:37 🔗 schbirid re quakedev.com, the domain is indeed lost to squatters
19:38 🔗 schbirid can we spoof/fake a wget crawl on a local server and get that into the wayback machine? would be both awesome and scary if
19:41 🔗 Nemo_bis http://diskdigger.org/
20:10 🔗 exmic schbirid: yes, it's possible.
20:26 🔗 balrog yes you use a hosts file
20:26 🔗 balrog or rather a hosts file entry
20:50 🔗 ersi Remember to be on the lookout for potential subdomains of quakedev.com, if you're gonna hard code it into your hosts file or such.
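To make the hosts-file idea concrete: a hosts entry is one line with an IP address followed by the hostnames that should resolve to it. The sketch below builds such a line for a domain plus any subdomains you know about, which is the point ersi raises. It is a minimal illustration under assumptions: the function name `hosts_entry` is hypothetical, 127.0.0.1 stands in for whatever local server holds the copy, and "www" is only a guessed subdomain, not one confirmed in this log.

```python
import ipaddress


def hosts_entry(ip, domain, subdomains=()):
    """Build one hosts-file line mapping a domain and its subdomains to ip."""
    # Raises ValueError if ip is not a valid IPv4 or IPv6 address.
    ipaddress.ip_address(ip)
    names = [domain] + [f"{sub}.{domain}" for sub in subdomains]
    return f"{ip} " + " ".join(names)


if __name__ == "__main__":
    # Assumed values: a local server and a guessed "www" subdomain.
    print(hosts_entry("127.0.0.1", "quakedev.com", ["www"]))
```

Any subdomain left out of the line keeps resolving through normal DNS, so a crawl that follows links to it would reach whatever the current domain holder serves.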
20:57 🔗 DFJustin I don't think ia wants falsified warcs
21:06 🔗 balrog DFJustin: it's not falsified.
21:06 🔗 balrog I have such a warc of hymn-project.org
21:06 🔗 balrog the domain name fell off but the ip address is still alive
21:07 🔗 balrog 184.105.182.100
21:07 🔗 DFJustin that's more borderline, schbirid is talking about a local backup of a site that's gone
21:07 🔗 balrog you mean a local backup of the entire server?
21:08 🔗 DFJustin the site contents
21:08 🔗 balrog so that if you bring up httpd it's the same disk/os/etc?
21:08 🔗 balrog wget rip of a wget rip -- nope
21:08 🔗 DFJustin at minimum there are dating issues, additionally there are going to be differences in server responses etc
21:08 🔗 balrog wget rip of a copy of the server put up might be ok
21:08 🔗 balrog (as in the server disk)
21:10 🔗 DFJustin what I would do in that case is mirror the content on your own public site, and then crawl that for wayback if you want
21:11 🔗 DFJustin he disconnected an hour ago though so he won't see any of this discussion
21:13 🔗 balrog yeah
23:25 🔗 DFJustin http://www.buzzfeed.com/kevintang/inside-chinas-insane-witch-hunt-for-slash-fiction-writers
