#archiveteam 2014-01-01,Wed


Time Nickname Message
01:01 🔗 Cameron_D http://allthingsd.com/20131231/you-say-goodbye-and-we-say-hello/
02:17 🔗 joepie91 Cameron_D: right
02:17 🔗 joepie91 game plan?
02:17 🔗 joepie91 because it sounds like we have less than 24 hours
02:18 🔗 joepie91 fuck it
02:18 🔗 joepie91 will throw it into archivebot
02:18 🔗 joepie91 and see what happens
02:20 🔗 Cameron_D that should get most of it
02:20 🔗 Cameron_D although they do seem to have a fair chunk of video content http://allthingsd.com/video/
02:27 🔗 joepie91 Cameron_D: I have to say that I'm a bit taken aback by the non-noisiness of the URLs on allthingsd
02:27 🔗 joepie91 it all seems... pretty sane
02:29 🔗 Cameron_D yeah, which is nice
02:29 🔗 joepie91 yes :P
02:30 🔗 Cameron_D The comments are all hosted externally so I don't think it's downloading them
02:46 🔗 joepie91 Cameron_D
02:46 🔗 joepie91 it seems to do comments fine
02:46 🔗 joepie91 it's grabbing stuff from avatars.fyre.co anyway
02:46 🔗 joepie91 (fyre == livefyre == afaik the comments system they use)
02:49 🔗 Cameron_D ah cool
04:09 🔗 godane i think i'm grabbing allthingsd videos
04:09 🔗 godane m.wsj.net/video/ is not returning a 403 error
04:09 🔗 godane i can grab all of it
05:30 🔗 godane Happy New Year!
05:32 🔗 dashcloud happy new year!
05:37 🔗 BiggieJon watch live.twit.tv, much better than any network tv
05:56 🔗 ivan` someone with upstream please grab https://www.youtube.com/user/AllThingsD/videos
05:56 🔗 ivan` youtube-dl handles /user/ URLs
05:57 🔗 ivan` youtube-dl --title --continue --retries 4 --write-info-json --write-description --write-thumbnail --write-annotations --all-subs --ignore-errors "https://www.youtube.com/user/AllThingsD/videos"
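The youtube-dl invocation above can also be driven from a script. The following is a minimal sketch, assuming youtube-dl is installed and on the PATH; the function name `build_youtube_dl_command` and the `retries` parameter are illustrative and not something from the log. It only builds and prints the command; the flags are exactly the ones ivan` quoted.

```python
import shlex


def build_youtube_dl_command(url, retries=4):
    """Return the argv list for the youtube-dl command quoted in the log.

    The function name and signature are hypothetical; the flags are copied
    from the chat message, nothing is added or removed.
    """
    return [
        "youtube-dl",
        "--title",
        "--continue",
        "--retries", str(retries),
        "--write-info-json",
        "--write-description",
        "--write-thumbnail",
        "--write-annotations",
        "--all-subs",
        "--ignore-errors",
        url,
    ]


CHANNEL_URL = "https://www.youtube.com/user/AllThingsD/videos"
command = build_youtube_dl_command(CHANNEL_URL)

# Print a shell-quoted version for inspection. To actually start the
# download, pass the list to subprocess.run(command).
print(shlex.join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with the URL.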
05:59 🔗 godane i found a way to grab the wsj source video
05:59 🔗 godane there is going to be tons of warc.gz of that
05:59 🔗 ivan` cool
06:00 🔗 godane ivan`: http://m.wsj.net/video/
06:00 🔗 godane i think all of wsj's videos are there
06:46 🔗 Dessiato Where can I find an archive of /soc/ about 3-4 months ago
06:55 🔗 wp494 ah hell no
06:55 🔗 wp494 http://techcrunch.com/2013/12/31/google-to-close-bump-and-flock-its-recently-acquired-file-sharing-apps/
07:21 🔗 Cameron_D aww man, bump was great
16:18 🔗 Schbirid anyone bored? some netlabel that could use its releases put into IA. i am not affiliated, just randomly found it. please be nice and slow as it is a lot of releases. tell me if you are doing it! http://www.darklandrecordings.com/releases
16:19 🔗 Schbirid also:
16:19 🔗 Schbirid http://www.starquakerecords.com/all.html
16:21 🔗 Schbirid and http://odgprod.com/
16:25 🔗 Schbirid last one should be easy http://odgprod.com/son/zip/ (but of course extracting and metadata is the hard work anyways)
16:30 🔗 Schbirid another http://www.endlessascent.com/
16:52 🔗 godane !ao http://www.slate.com/blogs/behold/2013/12/30/paula_salischiker_photographs_hoarders_in_britain_in_her_series_the_art.html
16:52 🔗 godane sorry
16:52 🔗 godane wrong channel
16:52 🔗 Smiley :D
18:36 🔗 SketchCow Internet Archive got $1.3 million for fund drive
18:39 🔗 Smiley \o/
18:40 🔗 SketchCow 10 petabytes to be purchased for disk space, apparently.
18:44 🔗 balrog wow nice!
18:48 🔗 ersi Yay!
19:37 🔗 SketchCow Yes, we've not quite outgrown the archive yet.
19:45 🔗 balrog yet.
19:46 🔗 SketchCow We are a little nutty with the space.
19:48 🔗 Nemo_bis I can upload the whole Wikimedia Commons repository 40 times in that space, hmm
19:48 🔗 turnip Oh god please don't
19:49 🔗 Nemo_bis ...of course not
19:52 🔗 Smiley :D
19:52 🔗 Smiley that'd be odd.
20:28 🔗 balrog http://blog.bu.mp/post/71781606704/all-good-things
21:12 🔗 zenguy_pc how long will 10 petabytes last?
21:13 🔗 SketchCow We estimate 18 months
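For a sense of scale, the 18-month estimate can be turned into an average fill rate. This is a back-of-the-envelope sketch, assuming decimal units (1 PB = 1000 TB), a constant rate, and an average month of 365.25 / 12 days; none of those assumptions come from the log.

```python
TB_PER_PB = 1000  # assumption: decimal petabytes

purchased_tb = 10 * TB_PER_PB  # "10 petabytes" from the log
months = 18                    # SketchCow's estimate

# Average consumption implied by filling the space in 18 months.
tb_per_month = purchased_tb / months

# Assumption: average month length of 365.25 / 12 days.
days = months * 365.25 / 12
tb_per_day = purchased_tb / days

print(f"{tb_per_month:.1f} TB/month, {tb_per_day:.1f} TB/day")
```

Under these assumptions the estimate works out to roughly 556 TB per month.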
21:14 🔗 Nemo_bis Did the on-demand wayback machine archiving increase the rate at which space is consumed?
21:15 🔗 godane SketchCow: i figured you would want to know about this: http://m.wsj.net/video/
21:16 🔗 godane all wall street journal videos
21:16 🔗 godane i'm making a collection of sorts: https://archive.org/search.php?query=creator%3A%22m.wsj.net%22
21:16 🔗 Schbirid awesome
21:17 🔗 godane think all things d is in the 19000xxx numbers
21:20 🔗 godane also you should know this bug has been around since christmas
21:20 🔗 godane based on google cache
21:23 🔗 ivan` SketchCow: any guess on how many TB of YouTube wayback has?
21:27 🔗 SketchCow Oh no idea.
22:05 🔗 godane i'm doing a grab of the index of m.wsj.net/video/
22:06 🔗 godane that way we can at least grab the files even if this folder is 403 again
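The idea of saving the index so the files stay reachable after the listing is locked again can be sketched as follows. This is a minimal illustration, assuming the index is an ordinary HTML directory listing with `<a href>` entries; the class and function names and the sample file names are hypothetical, and this is not godane's actual method (he mentions producing warc.gz files).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexLinkParser(HTMLParser):
    """Collect absolute URLs of entries in an HTML directory listing."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href or "?" in href:
            return  # skip empty links and column-sort links
        url = urljoin(self.base_url, href)
        # Keep only entries below the listing itself (drops parent links).
        if url.startswith(self.base_url) and url != self.base_url:
            self.links.append(url)


def extract_links(index_html, base_url):
    parser = IndexLinkParser(base_url)
    parser.feed(index_html)
    return parser.links


# Hypothetical sample listing; the real index was not quoted in the log.
SAMPLE_INDEX = """
<a href="?C=N;O=D">Name</a>
<a href="/">Parent Directory</a>
<a href="example-one.mp4">example-one.mp4</a>
<a href="example-dir/">example-dir/</a>
"""

links = extract_links(SAMPLE_INDEX, "http://m.wsj.net/video/")
for link in links:
    print(link)
```

Writing the resulting URL list to a file gives a record that can be fed to a downloader later, even if the directory listing itself starts returning 403 again.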
22:08 🔗 DFJustin ivan`: 932.48 TB in the youtubecrawl collection
22:23 🔗 ivan` DFJustin: wow, my guess was closer to 200TB
22:25 🔗 ivan` that's a lot of YouTube
22:37 🔗 SketchCow Mmmmm, sorting godane uploads
