[02:49] never mind about my ftp.bocaresearch.com request, glitch tells me SketchCow already snagged a copy a while ago
[02:49] (apologies if I missed the chatlog)
[03:00] (though if he has it, it doesn't seem to be in the FTP Site Boneyard)
[10:51] another collection of magazine scans: http://www.americanradiohistory.com/
[13:14] If someone wants to grab them and upload them via FTP, I can add them.
[13:27] SketchCow: just the PDFs, right?
[13:28] Are there other things? But yeah, likely just the PDFs.
[13:28] In JSMESS news, I've been doing a top-down compiling of every machine that JSMESS supports, in stages anyway, and all the machine types I think we're best at.
[13:28] SketchCow: I've not found anything yet, just checking I didn't need a WARC grab too.
[13:29] Oh yeah, no WARC needed.
[13:29] I'm going to give it a try, but anyone else can feel free to tell me they've done it.
[13:29] https://stackoverflow.com/questions/19883073/download-all-pdf-files-using-wget << seems like a nice way?
[13:34] K, it's going
[13:46] That's going to be a monster. There are thousands of files there. Good on you for taking it on.
[13:47] SketchCow: Did anything ever come of the site from a while back that had the Sears/JCPenney/other catalogues? I seem to recall that you contacted the owner.
[13:47] well, that didn't work well D:
[13:47] * SmileyG tries something else
[13:49] k, a wget with -A pdf is working better.
[13:57] so the Chinese are helping me save news videos from NBC and ABC
[13:57] that just makes me sad on the inside
[13:59] lol
[13:59] I talked with the guy who did the catalogs.
[13:59] He doesn't want to go on archive.org yet, has some dream or other.
[13:59] But he might in the future.
[14:00] I owe him a mail back, actually. He wanted to know what my site has "planned" for his material.
[14:00] Uh, shove it into a collection and never think of it again?
[14:00] Just working on how to phrase that hotness.
[14:01] :D
[14:03] SketchCow: take it and dark it for now?
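The recursive, PDF-only wget grab mentioned at [13:49] might look something like the sketch below. The exact command used in the channel isn't shown anywhere in the log; the flags here are standard wget options, and the command is built as a string and printed so it can be inspected before actually running it against the site.

```shell
# Hypothetical sketch of the "-A pdf" grab, not the exact command used:
#   -r        recurse into linked pages
#   -l inf    no depth limit
#   -np       never ascend above the starting directory
#   -nc       skip files already downloaded, so the crawl can be resumed
#   -A pdf    keep only files ending in .pdf (HTML pages are fetched for
#             link extraction and then deleted)
CMD='wget -r -l inf -np -nc -A pdf http://www.americanradiohistory.com/'
echo "$CMD"
```

Note that with `-A pdf`, wget still has to download every HTML page to discover links, so a site with thousands of files will take a long time regardless.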
[14:06] Feh, I'll work it out later.
[14:07] I'm hanging out in NYC today. Trying to get a few things done on the onlines.
[14:07] JSMESS repair is on the list, almost have my shit together.
[14:08] hm, next thing: archiving archive.org for the list? :p
[14:10] Personally, I would love it if a wiki page on archiveteam.org discussed all the known ways to export data out of archive.org.
[14:10] archiveteam.org could probably use a cleaning generally, actually.
[14:11] I'm glad we cut back on the spam.
[14:14] Like "Sneak into the IA datacentre and load all the hard disks onto a truck"
[14:15] Yes, QuestyCaptcha is nice
[14:24] s/truck/trucks probably Nemo_bis :p
[14:24] and don't forget, these disks are full so they are heavy ;-)
[14:25] hehe
[14:28] Tapes are still ridiculously cheap and the Internet2 link seems far from being at capacity
[14:29] let's start fundraising? :D
[14:29] Probably some US (or American in general) university lab, with the help of some cheap/free student labour, could easily download all IA data
[14:29] you guys also forgot about my 1PB DVD plan
[14:29] yes godane, but imagine if your cat scratches it
[14:30] "my cat just deleted the whole of German literature"
[14:30] that's why you make 50 copies
[14:30] also, I don't allow my cats in my room
[14:31] cats don't obey humans, but the opposite; it's the first law of catness
[14:31] Nemo_bis: I'd like to get a copy out of the US, tbh D:
[14:31] but, Internet2 over here? not sure it exists...
[14:32] Anyway, to #archiveteam-bs !
[14:32] In theory any GÉANT university would do
[19:12] damnit, wget locked up my system I think for that earlier grab, trying again now
[19:27] https://en.wikipedia.org/wiki/Holographic_Versatile_Disc
[19:37] re quakedev.com, the domain is indeed lost to squatters
[19:38] can we spoof/fake a wget crawl on a local server and get that into the Wayback Machine? would be both awesome and scary
[19:41] http://diskdigger.org/
[20:10] schbirid: yes, it's possible.
[20:26] yes, you use a hosts file
[20:26] or rather a hosts file entryt
[20:26] -t
[20:50] Remember to be on the lookout for potential subdomains of quakedev.com, if you're gonna hard-code it into your hosts file or such.
[20:57] I don't think IA wants falsified WARCs
[21:06] DFJustin: it's not falsified.
[21:06] I have such a WARC of hymn-project.org
[21:06] the domain name fell off but the IP address is still alive
[21:07] 184.105.182.100
[21:07] that's more borderline; schbirid is talking about a local backup of a site that's gone
[21:07] you mean a local backup of the entire server?
[21:08] the site contents
[21:08] so that if you bring up httpd it's the same disk/os/etc?
[21:08] wget rip of a wget rip -- nope
[21:08] at minimum there are dating issues; additionally there are going to be differences in server responses etc
[21:08] wget rip of a copy of the server put back up might be ok
[21:08] (as in the server disk)
[21:10] what I would do in that case is mirror the content on your own public site, and then crawl that for wayback if you want
[21:11] he disconnected an hour ago though, so he won't see any of this discussion
[21:13] yeah
[23:25] http://www.buzzfeed.com/kevintang/inside-chinas-insane-witch-hunt-for-slash-fiction-writers
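The hosts-file trick discussed above (around [20:26]–[20:50]) can be sketched roughly as follows. This assumes the rescued site contents are being served by a local web server on 127.0.0.1; the hosts entry is only printed here, since editing /etc/hosts requires root. The dating and server-response caveats raised at [21:08] apply to any WARC produced this way.

```shell
# Point the dead domain (and any known subdomains, per the [20:50]
# warning) at a local mirror. For a real crawl this line would be
# appended to /etc/hosts; here it is just shown.
HOSTS_ENTRY='127.0.0.1  quakedev.com  www.quakedev.com'
echo "$HOSTS_ENTRY"
# Then crawl through that entry, writing a WARC of the session:
#   wget --mirror --page-requisites \
#        --warc-file=quakedev.com --warc-cdx \
#        http://quakedev.com/
```

As noted in-channel, a cleaner alternative is to mirror the content on a real public site and let a normal crawl pick it up from there.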