[01:03] home again, home again; jiggidy jig
[01:03] http://code.google.com/p/ajaxportal/
[01:03] This is awesomely awful
[01:05] heh
[02:05] underscor: I keep hoping that AJAX portal thing is a joke
[02:06] underscor: but the more I read, the more I think that they're serious
[02:09] hahahaha
[02:20] wow
[02:20] ten minutes to do an ls -s
[02:20] and counting!
[02:20] I <3 EC2
[02:21] so, pointless numbers
[02:21] the biggest WARC I have from Proust is 34.7 MB, from a user named "tom"
[02:22] OH
[02:22] that's Tom Cortese
[02:22] who is a co-founder of proust.com
[02:22] haha
[02:24] oh, and the median WARC size is 3.5 MB
[02:28] brb, rugula cookies :o
[06:16] godfuckingdamnit I'm still getting splinder noconn errors
[06:16] I'm starting to think that those are just going to be a fact of life
[07:14] .ist gay
[08:42] esteemed ladies and gentlemen of #archiveteam
[08:42] http://wayback.at.ninjawedding.org/
[08:42] example: http://wayback.at.ninjawedding.org/20120107095010/http://blog.proust.com/
[08:42] I need to figure out a way to get WARC uploads to this thing going
[08:42] maybe a webapp
[08:43] Nice domain name.
[08:43] * Wyatt|Wor approves.
[08:43] heh
[08:43] in the meantime, if you want to be able to upload WARCs, uh
[08:43] I guess ask me for SSH access or something?
[08:43] I dunno
[08:43] there's not that much space on it; it's just an EC2 micro
[08:43] Hehe, fair enough.
[08:43] I figured a place to easily view WARCs would help
[08:44] ACTUALLY
[08:44] * yipdw tries to whip up an upload webapp in 30 minutes
[08:45] "Yesterday 2 of 24 pings failed. That doesn't mean my site isn't working all the time. It means it failed 8% of the time -- unacceptable."
[08:45] Dunning-Kruger Effect, hooooo~
[08:48] http://wayback.at.ninjawedding.org/20120109070224/http://www.proust.com/story/jordancohen/entry/229465/when-did-you-meet-your-best-friend
[08:57] wow, something is seriously killing that server
[09:27] ok, I've got an upload endpoint
[09:30] it's https://github.com/capotej/uploadd running on http://wayback.at.ninjawedding.org:9292; see that link for interface specification
[09:32] It seems like Archive.org's WBM barfed yesterday
[09:32] No gateway to internet
[09:33] Also, I almost find it silly it needs to check the live robots.txt for every access
[09:45] actually, meh, it's not working and I need to crash
[09:45] i'll fix it later
[09:53] Night, yipdw
[14:52] MORNING
[16:04] http://media.tumblr.com/tumblr_lxgpk8HC0S1qfvo9z.jpg
[16:30] mrnnngh
[16:41] chronomex: I'm slow on responding to guy, getting things done and your step-carefully instructions make me carefully set aside distractions to talk.
[16:42] * chronomex nods quietly
[16:42] SketchCow, is that Nick Cage's first appearance? :)
[16:42] he mail you? cool.
[16:42] No.
[16:43] I just mean YOUR mail.
[16:43] o ok
[16:50] http://a6.sphotos.ak.fbcdn.net/hphotos-ak-ash4/380952_347759215253538_205344452828349_1303147_1960047679_n.jpg
[16:50] care for a game of dwarf tossing?
[17:44] So everyone knows.
[17:44] I'm moving all the archiveteam sites over to underscor's hosting company, as I've decided a hosting company run by insane teenagers can't be worse than Dreamhost at this juncture.
[17:45] After that's done, I'll high-focus on getting our google search juice back to normal.
[17:45] didn't know underscor had a hosting company
[17:45] He does.
[17:45] underscor is good peoples
[17:47] He has a brother and sister who look just like him.
[17:47] So if we work him to death, I can replace him
[17:48] haha
[17:48] bwahahaa
[17:50] lol
[17:52] I'm willing to bet underscor's crack team of insane teenagers is better than our (JL's) not-so-crack team of dwindling twenty-somethings. :/
[17:52] JL?
[17:53] Sorry, Jumpline. The company of many faces I work for on weekends.
[17:53] oh
[18:05] SketchCow: I just learned of a Usenet archive published on CD in 1992-95. If you run across Sterling Software's NetNews/CD at some point.. http://tidbits.com/article/3229
[18:06] that's apparently what google used for that time period
[18:10] Will grab.
[18:44] btw I seem to have not uploaded my splinder data
[18:45] can someone tell me what I need to do (iirc rsync slots?)
[18:45] yes rsync. talk to SketchCow
[18:57] the ssh client shipped with OpenSSH 5.2_p1 on OS X seems to be slightly stupid
[18:57] if I pass it an identity to use with -i, the thing *still* tries all identities it knows about
[18:57] which results in weird "Too many authentication failures" errors
[19:01] weird
[19:06] yipdw: As in you ssh -i path/to/key user@host and it still does it?
[19:09] Wyatt: yes
[19:09] Bizarre.
[19:09] Wyatt: further debugging switches reveal that the client doesn't recognize the format of the identity file
[19:09] no idea why; it works on everything else, including another OS X Snow Leopard machine
[19:10] I'm just going to chalk it up to "random stupid bullshit" and use said other machine
[19:10] I blame...err... dog! I blame dog.
[19:47] wow, the NetNews CDs seem even more scarce than the PhoneDisc CDs I've been attempting to accumulate
[19:48] though it appears Google did get ahold of a set at one point...
[20:19] haha
[20:19] [ec2-user@ip-10-243-119-16 data]$ time cd H/
[20:19] real 0m1.763s
[20:20] SketchCow: I'm going to dump a bunch of Proust data on Batcave; it's looking like it'll be in the tens-of-gigs range
[20:21] is that okay?
[20:28] DO IT
[20:32] okey
[20:33] SketchCow: it's just the public profiles for now, by the way. I haven't found a way around Proust's authorization measures
[20:42] Another takedown!
[20:42] It's an excellent first step, yipdw
[20:46] SketchCow: no problem. also, if you want to see an example of what's been saved so far, hit up http://wayback.at.ninjawedding.org/20120109065146/http://www.proust.com/story/tom/
[20:47] http://www.archive.org/details/moves-magazine
[20:47] Being yanked down
[20:47] Proust does annoying things re: returning roots, so returning to the storybook mode from the given links won't work unless you manually suffix a /
[20:47] I'll look into ways to fix that
[20:47] underscor: wake up
[20:47] hmm
[20:48] time to archive the archive
[20:48] Yes, exactly.
[20:48] underscor has written an archive.org archiver for me - I want him to test it on edding.org/20120109065146/http://www.proust.com/story/tom/ primus104
[20:48] 16:10 <@SketchCow> http://www.archive.org/details/moves-magazine
[20:48] ha ha pasteeeeeee
[20:51] We are the owners of MOVES magazine and all content published in all issues. We have not authorized the scanning of our magazine issues or the uploading of these files.
[20:51]
[20:51] I declare, under penalty of perjury, that this information is accurate and that I am the designated individual to act on behalf of this corporation.
[20:51]
[20:51] UNDER. PENALTY. OF. PERJURY.
[20:51] The best informal correspondence has those words.
[20:52] might as well be informal for all the penalties of perjury a DMCA takedown has resulted in
[21:02] http://www.therestartpage.com/#
[21:02] SketchCow: here now
[21:02] http://www.archive.org/details/moves-magazine
[21:02] Grab it
[21:03] Test your little program set
[21:03] I'm negotiating with the rights holders and the discussion is not going to go well
[21:03] --
[21:03] Dr. Cummins:
[21:03] My apologies for any stress.
My name is Jason Scott, and I'm the person who found scans of Moves magazine online and uploaded them to the Internet Archive. As the archive is a recognized library and a registered non-profit that makes available lots of information to the world, and as I thought the issues of Moves Magazine, being 30 years old were of more historical interest than anything else, I put them there. I acquired indexes to them from Greg
[21:04] Is there any agreement, linking or any other situation that could convince you and/or Strategies and Tactics to make them available on archive.org? They don't seem available under any other situation and it would be a shame for historical documents to be unavailable for researchers.
[21:04] --
[21:04] I only share this to show people how I try to reason with folks.
[21:04] It's running
[21:05] 0 9:04PM:abuie@abuie-dev:~ 18653 π ruby ia_grab.rb moves-magazine
[21:05] moves-magazine is a collection
[21:05] Let's fetch the list of its members
[21:05] We found 60 results
[21:05] Mirroring moves-magazine-01 because its parent is moves-magazine
[21:05] http://ia600807.us.archive.org/17/items/moves-magazine-01/Moves-Magazine-01.djvu to Moves-Magazine-01.djvu
[21:05] ...........................................................................................http://ia600807.us.archive.org/17/items/moves-magazine-01/Moves-Magazine-01.epub to Moves-Magazine-01.epub
[21:06] SketchCow: btw, the first paragraph got cropped at "I acquired indexes to them from Greg", if you intended to share the whole thing.
[21:06] Greg Costikyan, and matched them to the digitized magazines, which is where the writing came from.
[21:06] I also would like to know how long your absorb takes, underscor
[21:07] Ok
[21:08] It's downloaded 15 magazines in 5 minutes
[21:08] so we should be done with the whole collection in like 20
[21:08] Oh, actually, let me stop another process, it's pegging the cpu
[21:08] Might speed it up
[21:08] does it also pull the item detail page and the xml files?
[21:09] Just the _files.xml and the _meta.xml
[21:09] It pulls everything _from_ the item
[21:09] But not anything generated by archive.org
[21:09] Since that can be reconstructed from the meta.xml
[21:09] Right.
[21:09] alright
[21:09] that's cool
[21:10] 22 now
[21:11] http://i.imgur.com/GDtpU.png
[21:11] It's rather mesmerizing to watch
[21:11] I should probably stick some newlines in there
[21:12] so you are grabbing the derived files as well?
[21:13] Yeah
[21:13] Sorry, I should have been more clear
[21:13] I'm grabbing everything that's listed in the _files.xml
[21:13] But none of the HTML pages generated by archive.org
[21:30] Oh, it finished
[21:30] 22 minutes for 3.5GB
[21:31] SketchCow: It's at ~abuie/moves-magazine on home.us
[21:32] anyway -- i tested the blogger wget dump on custom domains and it works
[21:34] Thanks.
[21:35] So, Takedown #2?
[21:35] I've talked with him.
[21:35] I interviewed him 7 years ago for the BBS Documentary.
[21:35] We're good
[21:35] * thunderah awaits emijrp
[21:35] Takedown #2?
[22:21] TAKEDOWN II
[22:21] JUDGMENT DAY
[22:25] Cheap Web Hosting (3GB Disk Space, 3GB Data Transfer) - urlte.am (01/09/2012 - 02/08/2012) $3.00 USD
[22:25] MOTHERFUCKING RIPOFF ARTISTS
[22:25] RRGRRHH
[22:28] GUESS THE KID'S NOT GETTING A NEW DRESS FOR PROM
[22:29] Wow, my internet just went seriously to shit.
[22:33] that's...a lot less than what I'm paying
[22:45] that's a lot more than what I'm paying :P
[22:45] but the pricing scale isn't comparable
[23:35] SketchCow: Oh, I thought you were getting me a dress made out of floppy disks
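
The mirroring approach described in the 21:09 to 21:13 messages (take everything listed in an item's _files.xml, skip the HTML pages archive.org generates) can be sketched roughly as follows. This is a minimal Python illustration and not ia_grab.rb, which is Ruby and is not shown in the log. The `<files>`/`<file name="...">` layout of _files.xml and the `https://archive.org/download/<item>/<file>` URL pattern are assumptions here, and `item_file_urls` and `SAMPLE` are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Assumed public download prefix; not taken from the log.
DOWNLOAD_BASE = "https://archive.org/download"


def item_file_urls(identifier, files_xml):
    """Return (filename, url) pairs for every <file> entry in an item's _files.xml.

    Only files listed in the XML are returned, so pages generated by
    archive.org (item detail pages and the like) are never included.
    """
    root = ET.fromstring(files_xml)
    pairs = []
    for entry in root.iter("file"):
        name = entry.get("name")
        if not name:
            continue  # skip malformed entries that carry no file name
        url = f"{DOWNLOAD_BASE}/{quote(identifier)}/{quote(name)}"
        pairs.append((name, url))
    return pairs


# Hypothetical sample in the assumed _files.xml layout.
SAMPLE = """<files>
  <file name="Moves-Magazine-01.djvu" source="derivative"><format>DjVu</format></file>
  <file name="Moves-Magazine-01.epub" source="derivative"><format>EPUB</format></file>
  <file name="moves-magazine-01_meta.xml" source="original"><format>Metadata</format></file>
</files>"""

if __name__ == "__main__":
    for name, url in item_file_urls("moves-magazine-01", SAMPLE):
        print(url, "to", name)
```

A real mirror would fetch each item's _files.xml first and then download every returned URL. Both original and derived files come back because the listing is used as-is, which matches the "grabbing the derived files as well" answer in the log.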