#archiveteam 2013-01-13,Sun


Time Nickname Message
00:19 🔗 chronomex erp
04:06 🔗 tef any attempts to archive aaron's stuff yet
04:09 🔗 kennethre tef: i'll assist in any way, if needed
04:10 🔗 tef i'm firing up my work's crawler atm on news.yc +1
04:17 🔗 Famicoman jasons been working on some stuff
04:17 🔗 Famicoman godane grabbed a copy of his site
04:17 🔗 Famicoman and I'm sure there are other things
04:19 🔗 tef ah cool, I was wondering if someone would do that
04:19 🔗 tef I did the newsyc for sopa
04:28 🔗 godane Famicoman: I got SpikeTV Xbox 360 2011 coverage
04:29 🔗 Famicoman noice
04:36 🔗 godane its also 720p version
04:37 🔗 godane whats funny is the release group calls themself "Aggressive Archive Force"
04:38 🔗 Famicoman haha wow
04:38 🔗 GLaDOS Heh
04:47 🔗 SketchCow OK, now on a proper laptop.
04:47 🔗 SketchCow We have some rough stuff.
04:48 🔗 tef SketchCow: i'm running a crawl of hn frontpage + all links appearing on it. should have snapshots, and ajax shit too.
04:48 🔗 tef in the warcs.
04:48 🔗 chronomex great
04:49 🔗 tef not sure what to do about twitter
04:49 🔗 tef especially #pdftribute
05:02 🔗 SketchCow My co-workers and I made duplicate pages.
05:02 🔗 SketchCow http://archive.org/details/ark-aaronsw
05:02 🔗 SketchCow and
05:03 🔗 SketchCow http://archive.org/details/aaronsw
05:06 🔗 tef oops
05:07 🔗 tef about halfway with the hackernews +1 link
05:13 🔗 godane https://www.youtube.com/watch?v=AqZNebWoqnc
05:13 🔗 godane that is another video for len sassaman afk
05:30 🔗 balrog_ SketchCow: I notice that several interesting sections of Aaron's website were removed and blocked with robots.txt in the past but I'm sure you're all aware
05:33 🔗 tef ppfft who looks at robots.txts. wimpy crawlers.
05:34 🔗 Cameron_D I do, to find bonus things to crawl :3
05:35 🔗 GLaDOS "Disallow? Must be some saucy stuff in here.."
05:41 🔗 balrog_ yeah but some of that was pulled too which means its not in Wayback
05:51 🔗 godane starting the upload of sega visions now
06:27 🔗 godane some of my items are not showing up
06:31 🔗 godane i'm going to bed
06:31 🔗 godane hope the internal error stuff goes away
08:11 🔗 GLaDOS Aaron Swartz is trending worldwide on twitter.
08:11 🔗 GLaDOS Wow.
08:12 🔗 kennethre GLaDOS: incredible
08:32 🔗 GLaDOS ...and he disappears.
11:06 🔗 SketchCow Huuuuug
11:09 🔗 SketchCow I'm adding a pile of material (Atari, Creative Computing, sorting BITSAVERS)
12:56 🔗 Nemo_bis NATO's ftp done: Downloaded: 16099 files, 58G in 11d 12h 4m 0s (61.6 KB/s)
14:50 🔗 schbiridi nice, Nemo_bis
16:52 🔗 emijrp i haz a script to move videos from youtube to internet archive
16:53 🔗 emijrp so, if we are ok with uploading all videos about aaron (including copyright ones) we can proceed...
17:09 🔗 ersi IA has some youtube-grabbing infra as well afaik
17:18 🔗 adamcaudi Can someone that's a bit more familiar with wget / warc files take a look at this and see if I've done anything stupid? https://gist.github.com/4524708
17:19 🔗 adamcaudi It seems right to me, but I'd rather not collect a few GB of mirrors then realize I missed something
17:23 🔗 balrog_ I hope someone's archiving the current #pdftribute
17:23 🔗 balrog_ (twitter hashtag)
17:25 🔗 emijrp and his tw account?
17:29 🔗 balrog_ hmm.
17:29 🔗 balrog_ #pdftribute is various academics posting their papers to be freely available in protest of paywalls
17:45 🔗 ersi emijrp: then again, it's always nice to have a copy if you're able to grab
17:46 🔗 emijrp i will send the script to Nemo_bis
17:46 🔗 emijrp i dont have upload bandwidth for that
17:46 🔗 emijrp go go go http://archiveteam.org/index.php?title=Aaron_Swartz
17:46 🔗 Nemo_bis emijrp: how many are they?
17:46 🔗 emijrp 250
17:46 🔗 Nemo_bis oh, should be feasible then
17:47 🔗 Nemo_bis I don't have much free upload or disk right now
17:47 🔗 SketchCow Please grab PDFtributes if possible
17:48 🔗 emijrp http://archiveteam.org/index.php?title=Aaron_Swartz/YouTube_videos
17:49 🔗 emijrp add links in wiki to the grabs, so we see what is complete
17:49 🔗 Nemo_bis emijrp: are you talking to me?
17:49 🔗 emijrp no
17:49 🔗 Nemo_bis ah ok
17:49 🔗 * Nemo_bis waiting for the script
17:49 🔗 Nemo_bis if someone else could run it I wouldn't be offended though ^^
17:52 🔗 emijrp Nemo_bis: http://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py
17:53 🔗 Nemo_bis emijrp: have you updated the collections and so on?
17:53 🔗 emijrp no
17:53 🔗 emijrp wait..
18:25 🔗 emijrp Nemo_bis: try now, read the instructions http://code.google.com/p/emijrp/source/browse/trunk/scrapers/youtube2internetarchive.py
18:26 🔗 emijrp save links to videos in download/videostodo.txt
18:26 🔗 emijrp and then python youtube2internetarchive.py english all aaronsw
18:29 🔗 Nemo_bis File "youtube2internetarchive.py", line 59
18:29 🔗 Nemo_bis 'es': {'01':'january', '02': 'february', '03':'march', '04':'april', '05':'may', '06':'june', '07':'july', '08':'august','09':'september','10':'october', '11':'november', '12':'december'}
18:29 🔗 Nemo_bis ^
18:29 🔗 Nemo_bis SyntaxError: invalid syntax
18:29 🔗 godane this is cool: http://web.archive.org/web/20120911024204/http://www.underground-gamer.com/forums.php?action=viewforum&forumid=40&page=25
18:30 🔗 emijrp fixed Nemo_bis
18:30 🔗 balrog_ godane: why is that on IA?
18:30 🔗 godane cause i mirrored it
18:30 🔗 balrog_ ah.
18:30 🔗 ersi awesome ;D
18:30 🔗 Smiley it was a int, and now you make it a string?
18:30 🔗 ersi godane: Hello there, chris1975
18:31 🔗 godane thats my username there
18:31 🔗 godane what sucks is i didn't do this to bitgamer forums
18:32 🔗 balrog_ :(
18:32 🔗 Nemo_bis emijrp: what keys should I use?
18:34 🔗 balrog_ godane: I wish they had imported the forums to the ug forums
18:34 🔗 balrog_ would have been nice
18:34 🔗 emijrp Nemo_bis: yours?
18:34 🔗 Nemo_bis emijrp: what collection is it?
18:35 🔗 emijrp aaronsw
18:38 🔗 Nemo_bis and everyone can write to it?
18:39 🔗 emijrp dont know
18:39 🔗 emijrp you can request admin role to SketchCow ?
18:41 🔗 Nemo_bis they all seem to be erroring on download
18:41 🔗 emijrp update youtube-dl ..
18:42 🔗 godane balrog_: i'm getting other crap like spiketv video game awards too
18:42 🔗 godane found some copys going back to 2008
18:44 🔗 godane so there is going to be a spiketv-specials collection in computer and tech videos collections sometime
18:45 🔗 Nemo_bis Traceback (most recent call last):
18:45 🔗 Nemo_bis File "youtube2internetarchive.py", line 138, in <module>
18:45 🔗 Nemo_bis upload_month = num2month[language][json_['upload_date'][4:6]]
18:45 🔗 Nemo_bis KeyError: 'english'
18:45 🔗 Nemo_bis emijrp: ^
18:46 🔗 ersi /query
18:47 🔗 Smiley stop trying to convert int to string?
18:47 🔗 emijrp Nemo_bis: fixed
18:47 🔗 emijrp Smiley: not that
18:48 🔗 Smiley D:
18:52 🔗 Nemo_bis emijrp: are you adding a keyword?
18:52 🔗 emijrp yes... look at the code
18:52 🔗 Nemo_bis ok
18:53 🔗 godane there is only ~8000 urls from g4tv.com feed to go
18:54 🔗 godane *thefeed
18:54 🔗 Famicoman godane let me know if you ever find the halo 2 specials done by mtv and spiketv in 2004
18:55 🔗 Famicoman Also, I think I have the first spiketv video game awards on vhs somewhere around here
18:55 🔗 godane Famicoman: did you get g4 e3 2007 or 2008
18:55 🔗 godane i'm also looking for g4 ces from 2008
18:56 🔗 Famicoman nah, I haven't found too many g4 specials
18:56 🔗 godane what do you have?
18:56 🔗 Famicoman I don't know, probably more techtv stuff than anything else
18:57 🔗 Famicoman I don't remember where I put it all
18:57 🔗 godane whats funny is i have most of that upload to archive.org now
18:57 🔗 Famicoman I feel like demonoid had a good amount of g4 stuff before it went down
18:57 🔗 Famicoman I think I had G4 comicon coverage for a few years
18:58 🔗 godane i have 2011 up and 2012 on my drive
18:58 🔗 emijrp Nemo_bis: works fine?
18:58 🔗 godane do you have any attack of the shows from 2010?
18:59 🔗 godane i have nov and dec of 2010
18:59 🔗 godane the full year of 2011
19:00 🔗 Nemo_bis emijrp: no
19:00 🔗 emijrp lol
19:00 🔗 emijrp query me
19:01 🔗 godane Famicoman: spiketv halo 2?: http://www.spike.com/full-episodes/blhn9j/gttv-halo-4-season-5-ep-528
19:14 🔗 godane now this you would not have without my help: http://web.archive.org/web/20120919075719/http://www.underground-gamer.com/forums.php?action=viewtopic&topicid=742&page=841
19:15 🔗 godane its 1500+ page forums from underground gamer in brasil
19:17 🔗 godane i am surprised how much i got from ug as far as the site looking the right
19:17 🔗 godane *the right way
20:18 🔗 dashcloud hi guys, found this: http://pdftribute.net/
20:18 🔗 dashcloud someone's getting all the #pdftribute links with papers and collecting them there
20:19 🔗 dashcloud here's a second site doing it as well: http://pdftribute.loc-com.de/
20:20 🔗 tef nice
20:20 🔗 dashcloud and this person: https://twitter.com/thejbf/statuses/290551198757560320 is archiving all of the #pdftribute tweets
