[00:19] erp
[04:06] any attempts to archive aaron's stuff yet
[04:09] tef: i'll assist in any way, if needed
[04:10] i'm firing up my work's crawler atm on news.yc +1
[04:17] jasons been working on some stuff
[04:17] godane grabbed a copy of his site
[04:17] and I'm sure there are other things
[04:19] ah cool, I was wondering if someone would do that
[04:19] I did the newsyc for sopa
[04:28] Famicoman: I got SpikeTV Xbox 360 2011 coverage
[04:29] noice
[04:36] its also 720p version
[04:37] whats funny is the release group calls themself "Aggressive Archive Force"
[04:38] haha wow
[04:38] Heh
[04:47] OK, now on a proper laptop.
[04:47] We have some rough stuff.
[04:48] SketchCow: i'm running a crawl of hn frontpage + all links appearing on it. should have snapshots, and ajax shit too.
[04:48] in the warcs.
[04:48] great
[04:49] not sure what to do about twitter
[04:49] especially #pdftribute
[05:02] My co-workers and I made duplicate pages.
[05:02] http://archive.org/details/ark-aaronsw
[05:02] and
[05:03] http://archive.org/details/aaronsw
[05:06] oops
[05:07] about halfway with the hackernews +1 link
[05:13] https://www.youtube.com/watch?v=AqZNebWoqnc
[05:13] that is another video for len sassaman afk
[05:30] SketchCow: I notice that several interesting sections of Aaron's website were removed and blocked with robots.txt in the past but I'm sure you're all aware
[05:33] ppfft who looks at robots.txts. wimpy crawlers.
[05:34] I do, to find bonus things to crawl :3
[05:35] "Disallow? Must be some saucy stuff in here.."
[05:41] yeah but some of that was pulled too which means its not in Wayback
[05:51] starting the upload of sega visions now
[06:27] some of my items are not showing up
[06:31] i'm going to bed
[06:31] hope then internal error stuff goes away
[08:11] Aaron Swartz is trending worldwide on twitter.
[08:11] Wow.
[08:12] GLaDOS: incredible
[08:32] ...and he disappears.
[11:06] Huuuuug
[11:09] I'm adding a pile of material (Atari, Creative Computing, soritng BITSAVERS)
[12:56] NATO's ftp done: Downloaded: 16099 files, 58G in 11d 12h 4m 0s (61.6 KB/s)
[14:50] nice, Nemo_bis
[16:52] i haz a script to move videos from youtube to internet archive
[16:53] so, if we are ok with uploading all videos about aaron (including copyright ones) we can proceed...
[17:09] IA has some youtube-grabbing infra as well afaik
[17:18] Can someone that's a bit more familiar with wget / warc files take a look at this and see if I've done anything stupid? https://gist.github.com/4524708
[17:19] It seems right to me, but I'd rather not collect a few GB of mirrors then realize I missed something
[17:23] I hope someone's archiving the current #pdftribute
[17:23] (twitter hashtag)
[17:25] and his tw account?
[17:29] hmm.
[17:29] #pdftribute is various academics posting their papers to be freely available in protest of paywalls
[17:45] emijrp: then again, it's always nice to have a copy if you're able to grab
[17:46] i will send the script to Nemo_bis
[17:46] i dont have upload bandwidth for that
[17:46] go go go http://archiveteam.org/index.php?title=Aaron_Swartz
[17:46] emijrp: how many are they?
[17:46] 250
[17:46] oh, should be feasible then
[17:47] I don't have much free upload or disk right now
[17:47] Please grab PDFtributes if possible
[17:48] http://archiveteam.org/index.php?title=Aaron_Swartz/YouTube_videos
[17:49] add links in wiki to the grabs, so we see what is complete
[17:49] emijrp: are you talking to me?
[17:49] no
[17:49] ah ok
[17:49] * Nemo_bis waiting for the script
[17:49] if someone else could run it I wouldn't be offended though ^^
[17:52] Nemo_bis: http://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py
[17:53] emijrp: have you updated the collections and so on?
[17:53] no
[17:53] wait..
[18:25] Nemo_bis: try now, read the instructions http://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py
[18:26] save links to videos in download/videostodo.txt
[18:26] and then python youtube2internetarchive.py english all aaronsw
[18:29] 'es': {'01':'january', '02': 'february', '03':'march', '04':'april', '05':'may', '06':'june', '07':'july', '08':'august','09':'september','10':'october', '11':'november', '12':'december'}
[18:29] File "youtube2internetarchive.py", line 59
[18:29] ^
[18:29] SyntaxError: invalid syntax
[18:29] this is cool: http://web.archive.org/web/20120911024204/http://www.underground-gamer.com/forums.php?action=viewforum&forumid=40&page=25
[18:30] fixed Nemo_bis
[18:30] godane: why is that on IA?
[18:30] cause i mirrored it
[18:30] ah.
[18:30] awesome ;D
[18:30] it was a int, and now you make it a string?
[18:30] godane: Hello there, chris1975
[18:31] thats my username there
[18:31] what sucks is i didn't do this to bitgamer fourms
[18:32] :(
[18:32] emijrp: what keys should I use?
[18:34] godane: I wish they had imported the forums to the ug forums
[18:34] would have been nice
[18:34] Nemo_bis: yours?
[18:34] emijrp: what collection is it?
[18:35] aaronsw
[18:38] and everyone can write to it?
[18:39] dont know
[18:39] you can request admin role to SketchCow ?
[18:41] they all seem to be erroring on download
[18:41] update youtube-dl ..
[18:42] balog_: i'm getting other crap like spiketv video game awards too
[18:42] found some copys going back to 2008
[18:44] so there is going to be a spiketv-specials collection in computer and tech videos collections sometime
[18:45] Traceback (most recent call last):
[18:45] File "youtube2internetarchive.py", line 138, in
[18:45] KeyError: 'english'
[18:45] upload_month = num2month[language][json_['upload_date'][4:6]]
[18:45] emijrp: ^
[18:46] /query
[18:47] stop trying to convert int to string?
[18:47] Nemo_bis: fixed
[18:47] Smiley: not that
[18:48] D:
[18:52] emijrp: are you adding a keyword?
[18:52] yes... lok the code
[18:52] ok
[18:53] there is only ~8000 urls from g4tv.com feed to go
[18:54] *thefeed
[18:54] godane let me know if you ever find the halo 2 specials done by mtv and spiketv in 2004
[18:55] Also, I think I have the first spiketv video game awards on vhs somewhere around here
[18:55] Famicoman: did you get g4 e3 2007 or 2008
[18:55] i'm also looking for g4 ces from 2008
[18:56] nah, I haven't found too many g4 specials
[18:56] what do you have?
[18:56] I don't know, probably more techtv stuff than anything else
[18:57] I don't remember where I put it all
[18:57] whats funny is i have most of that upload to archive.org now
[18:57] I feel like demonoid had a good amount of g4 stuff before it went down
[18:57] I think I had G4 comicon coverage for a few years
[18:58] i have 2011 up and 2012 on my drive
[18:58] Nemo_bis: works fine?
[18:58] do you have any attack of the shows from 2010?
[18:59] i have nov and dec of 2010
[18:59] the full year of 2011
[19:00] emijrp: no
[19:00] lol
[19:00] query me
[19:01] Famicoman: spiketv halo 2?: http://www.spike.com/full-episodes/blhn9j/gttv-halo-4-season-5-ep-528
[19:14] now this you would not have without my help: http://web.archive.org/web/20120919075719/http://www.underground-gamer.com/forums.php?action=viewtopic&topicid=742&page=841
[19:15] its 1500+ page forums from underground gamer in brasil
[19:17] i am suprise how much i got from ug as far as the site looking the right
[19:17] *the right way
[20:18] hi guys, found this: http://pdftribute.net/
[20:18] someone's getting all the #pdftribute links with papers and collecting them there
[20:19] here's a second site doing it as well: http://pdftribute.loc-com.de/
[20:20] nice
[20:20] and this person: https://twitter.com/thejbf/statuses/290551198757560320 is archiving all of the #pdftribute tweets