Time |
Nickname |
Message |
02:49
🔗
|
Stiletto |
never mind about my ftp.bocaresearch.com request, glitch tells me SketchCow already snagged a copy a while ago |
02:49
🔗
|
Stiletto |
(apologies if I missed the chatlog) |
03:00
🔗
|
Stiletto |
(though if he has it, it doesn't seem to be in the FTP Site Boneyard) |
10:51
🔗
|
ats |
another collection of magazine scans: http://www.americanradiohistory.com/ |
13:14
🔗
|
SketchCow |
If someone wants to grab them and upload them via FTP, I can add them. |
13:27
🔗
|
SmileyG |
SketchCow: just the pdf's right? |
13:28
🔗
|
SketchCow |
Are there other things? But yeah, likely just the PDFs. |
13:28
🔗
|
SketchCow |
In JSMESS news, I've been doing a top-down compiling of every machine that JSMESS supports, in stages anyway, and all the machine types I think we're best at. |
13:28
🔗
|
SmileyG |
SketchCow: I've not found anything yet, just checking I didn't need a warc grab too. |
13:29
🔗
|
SketchCow |
Oh yeah, no WARC needed. |
13:29
🔗
|
SmileyG |
I'm going to give it a try, but anyone else can feel free to tell me they've done it. |
13:29
🔗
|
SmileyG |
https://stackoverflow.com/questions/19883073/download-all-pdf-files-using-wget << seems nice way? |
13:34
🔗
|
SmileyG |
K, it's going |
13:46
🔗
|
SadDM |
That's going to be a monster. There are thousands of files there. Good on you for taking it on. |
13:47
🔗
|
SadDM |
SketchCow: Did anything ever come of the site from a while back that had the Sears/JCPenney/other catalogues? I seem to recall that you contacted the owner. |
13:47
🔗
|
SmileyG |
well that didn't work well D: |
13:47
🔗
|
* |
SmileyG tries something else |
13:49
🔗
|
SmileyG |
k a wget with -A pdf is working better. |
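(Editor's sketch, not SmileyG's actual command, which the log does not record.) A recursive wget run restricted by an accept list might be assembled roughly as below. Only `-A pdf` and the site URL come from the chat; the helper name `build_wget_pdf_command`, the `-r`, `-np`, and `-w` flags, and the one-second wait are illustrative assumptions.

```python
import shlex


def build_wget_pdf_command(start_url, wait_seconds=1):
    """Return an argv list for a recursive wget run that keeps only PDFs.

    Hypothetical helper: only "-A pdf" is taken from the chat log; the
    other flags are assumptions about what a polite recursive grab needs.
    """
    return [
        "wget",
        "-r",                      # follow links recursively
        "-np",                     # never ascend to the parent directory
        "-w", str(wait_seconds),   # wait between requests (assumed value)
        "-A", "pdf",               # accept list: keep files ending in "pdf"
        start_url,
    ]


command = build_wget_pdf_command("http://www.americanradiohistory.com/")
# Print a shell-quoted form that could be pasted into a terminal.
print(shlex.join(command))
```

The argv list can be handed to `subprocess.run(command)` as-is; building it as a list avoids shell-quoting mistakes in the URL.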
13:57
🔗
|
godane |
so the Chinese are helping me save news videos from NBC and ABC |
13:57
🔗
|
godane |
that just makes me sad on the inside |
13:59
🔗
|
SmileyG |
lol |
13:59
🔗
|
SketchCow |
I talked with the guy who did the catalogs. |
13:59
🔗
|
SketchCow |
He doesn't want to go on archive.org yet, has some dream or other. |
13:59
🔗
|
SketchCow |
But he might in the future. |
14:00
🔗
|
SketchCow |
I owe him a mail back, actually. He wanted to know what my site has "planned" for his material. |
14:00
🔗
|
SketchCow |
Uh, shove it into a collection and never think of it again? |
14:00
🔗
|
SketchCow |
Just working on how to phrase that hotness. |
14:01
🔗
|
SmileyG |
:D |
14:03
🔗
|
balrog |
SketchCow: take it and dark it for now? |
14:06
🔗
|
SketchCow |
Feh, I'll work it out later. |
14:07
🔗
|
SketchCow |
I'm hanging out in NYC today. Trying to get a few things done on the onlines. |
14:07
🔗
|
SketchCow |
JSMESS repair is on the list, almost have my shit together. |
14:08
🔗
|
midas |
hm, next thing: archiving archive.org for the list? :p |
14:10
🔗
|
SketchCow |
Personally, I would love if a wiki page on archiveteam.org discussed all the ways known to export data out of archive.org. |
14:10
🔗
|
SketchCow |
archiveteam.org can probably use a cleaning generally, actually. |
14:11
🔗
|
SketchCow |
I'm glad we cut back on the spam. |
14:14
🔗
|
Nemo_bis |
Like "Sneak into the IA datacentre and load all hard disks on a truck" |
14:15
🔗
|
Nemo_bis |
Yes, QuestyCaptcha is nice |
14:24
🔗
|
midas |
s/truck/trucks probably Nemo_bis :p |
14:24
🔗
|
midas |
and dont forget, these disks are full so they are heavy ;-) |
14:25
🔗
|
Nemo_bis |
hehe |
14:28
🔗
|
Nemo_bis |
Tapes are still ridiculously cheap and the Internet2 link seems way far from being at capacity |
14:29
🔗
|
SmileyG |
lets start fundraising? :D |
14:29
🔗
|
Nemo_bis |
Probably some USA (or even American in general) university lab, with the help of some cheap/free student labour, could easily download all IA data |
14:29
🔗
|
godane |
you guys also forgot about my 1PB dvd plan |
14:29
🔗
|
Nemo_bis |
yes godane, but imagine if your cat scratches it |
14:30
🔗
|
Nemo_bis |
"my cat just deleted the whole german literature" |
14:30
🔗
|
godane |
that's why you make 50 copies |
14:30
🔗
|
godane |
also i don't allow my cats in my room |
14:31
🔗
|
Nemo_bis |
cats don't obey humans, but the opposite; it's the first law of catness |
14:31
🔗
|
SmileyG |
Nemo_bis: I'd like to get a copy out of the US tbh D: |
14:31
🔗
|
SmileyG |
but, internet2 over here? not sure it exists... |
14:32
🔗
|
SmileyG |
Anyway to #archiveteam-bs ! |
14:32
🔗
|
Nemo_bis |
In theory any Geant university would do |
19:12
🔗
|
Smiley |
damnit, wget locked up my system I think for that earlier grab, trying again now |
19:27
🔗
|
midas |
https://en.wikipedia.org/wiki/Holographic_Versatile_Disc |
19:37
🔗
|
schbirid |
re quakedev.com, the domain is indeed lost to squatters |
19:38
🔗
|
schbirid |
can we spoof/fake a wget crawl on a local server and get that into the wayback machine? would be both awesome and scary if |
19:41
🔗
|
Nemo_bis |
http://diskdigger.org/ |
20:10
🔗
|
exmic |
schbirid: yes, it's possible. |
20:26
🔗
|
balrog |
yes you use a hosts file |
20:26
🔗
|
balrog |
or rather a hosts file entryt |
20:26
🔗
|
balrog |
-t |
20:50
🔗
|
ersi |
Remember to be on the lookout for potential subdomains of quakedev.com, if you're gonna hard code it into your hosts file or such. |
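(Editor's sketch of the hosts-file approach discussed above.) A hosts entry is one line: an IP address followed by the names that should resolve to it. The helper name `format_hosts_entry` is hypothetical. The `www.` name is an assumed subdomain, and the sample values reuse the hymn-project.org address that balrog quotes later in this log.

```python
import ipaddress


def format_hosts_entry(ip, hostnames):
    """Build one hosts-file line mapping several names to a single IP.

    Hypothetical helper for illustration; it validates the address and
    refuses an empty name list, then joins the fields with spaces.
    """
    ipaddress.ip_address(ip)  # raises ValueError if ip is not a valid address
    if not hostnames:
        raise ValueError("at least one hostname is required")
    return " ".join([ip, *hostnames])


# Include every subdomain the crawl should reach, per ersi's warning.
# "www.hymn-project.org" is an assumption, not confirmed in the chat.
entry = format_hosts_entry(
    "184.105.182.100",
    ["hymn-project.org", "www.hymn-project.org"],
)
print(entry)
```

The resulting line would be appended to the system hosts file (for example `/etc/hosts` on Unix-like systems) before running the crawl, and removed afterwards.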
20:57
🔗
|
DFJustin |
I don't think ia wants falsified warcs |
21:06
🔗
|
balrog |
DFJustin: it's not falsified. |
21:06
🔗
|
balrog |
I have such a warc of hymn-project.org |
21:06
🔗
|
balrog |
the domain name fell off but the ip address is still alive |
21:07
🔗
|
balrog |
184.105.182.100 |
21:07
🔗
|
DFJustin |
that's more borderline, schbirid is talking about a local backup of a site that's gone |
21:07
🔗
|
balrog |
you mean a local backup of the entire server? |
21:08
🔗
|
DFJustin |
the site contents |
21:08
🔗
|
balrog |
so that if you bring up httpd it's the same disk/os/etc? |
21:08
🔗
|
balrog |
wget rip of a wget rip -- nope |
21:08
🔗
|
DFJustin |
at minimum there are dating issues, additionally there are going to be differences in server responses etc |
21:08
🔗
|
balrog |
wget rip of a copy of the server put up might be ok |
21:08
🔗
|
balrog |
(as in the server disk) |
21:10
🔗
|
DFJustin |
what I would do in that case is mirror the content on your own public site, and then crawl that for wayback if you want |
21:11
🔗
|
DFJustin |
he disconnected an hour ago though so he won't see any of this discussion |
21:13
🔗
|
balrog |
yeah |
23:25
🔗
|
DFJustin |
http://www.buzzfeed.com/kevintang/inside-chinas-insane-witch-hunt-for-slash-fiction-writers |