#archiveteam 2013-06-11,Tue


Time Nickname Message
02:06 🔗 Shicky256 I'm having a problem with Warrior.
02:06 🔗 Shicky256 I'm working on the Formspring project, but it keeps failing uploads.
02:07 🔗 Shicky256 Anyone have an idea why?
02:12 🔗 Shicky256 oh well.
02:12 🔗 Shicky256 I hope that this at least gets to whoever made Warrior.
06:08 🔗 SketchCow http://archive.org/details/ernie1241_general
06:08 🔗 SketchCow Lots of goodies here
14:55 🔗 SketchCow Big ol' ops hug
14:55 🔗 SketchCow OK, so how are we doing, here?
14:55 🔗 SketchCow Posterous is slow but it's progressing along, we'll get what we get until the machines are gone.
14:56 🔗 SketchCow Formspring humming along - Formspring is not happy about this at all, but it's happening.
14:56 🔗 SketchCow URL Team fine, last I checked, new admin Chronomex
14:57 🔗 SketchCow I think we're currently stalled with Xanga. We really should not be.
14:57 🔗 SketchCow Are there other web downloads I'm missing?
14:58 🔗 Smiley IGN/Gamespot finished afaik
14:59 🔗 Smiley I've uploaded all my warcs. They are currently sitting on my account.
14:59 🔗 Smiley Pouet is still going.
14:59 🔗 SketchCow Need help putting them somewhere?
14:59 🔗 Smiley Well they are all on IA. I can't do anything I believe?
14:59 🔗 Smiley I'd love to have them go somewhere sensible.
14:59 🔗 SketchCow Oh, sitting on your IA account!
14:59 🔗 Smiley Yah, you've been busy so I didn't want to keep chasing :)
15:00 🔗 Smiley Also there's 50+ newamerica warcs I believe.
15:00 🔗 Smiley at 2Gb each.
15:00 🔗 SketchCow I am going through piles of uploads like that.
15:00 🔗 Smiley k, hit my account at some point, then if you let me know, I'll tweet my pride :D
15:00 🔗 SketchCow Ha ha Yes.
15:01 🔗 SketchCow pm me the e-mail account. I'll go through it
15:01 🔗 SketchCow Oh yes, one of my favorites.
15:02 🔗 SketchCow Oh, you HAVE been busy
15:02 🔗 Smiley -rw-r--r-- 1 tim.bowers games 626M Jun 11 16:01 ./bin/ign/storage/pouet/pouet.net_06052013.cdx
15:02 🔗 Smiley -rw-r--r-- 1 tim.bowers games 57G Jun 11 16:01 ./bin/ign/storage/pouet/pouet.net_06052013.warc
15:02 🔗 Smiley Still going.
15:02 🔗 SketchCow Dude, pouet is large. Do you realize how large?
15:02 🔗 Smiley SketchCow: yah, 5 weeks signed off work will do that to a guy XD
15:02 🔗 Smiley though I did it all in about 2 weeks.
15:02 🔗 Smiley After that, kind of ran out of things to grab XD
15:03 🔗 Smiley SketchCow: Like I care... it'll keep going until it explodes :D
15:03 🔗 Smiley Unless it tops out 1.5Tb, then we have a problem D:
15:07 🔗 SketchCow OK, they're swapped over.
15:07 🔗 ivan` greader-grab can really start going once I strip out the 404s
15:07 🔗 SketchCow http://archive.org/details/archiveteam_ignsites will populate sooner rather than later.
15:07 🔗 Smiley yey
15:09 🔗 SketchCow There, it populated.
15:12 🔗 Smiley \o/
15:12 🔗 godane1 you got the comment pages: https://archive.org/details/www.g4tv.com-thefeed-comments-pages-20130610
15:12 🔗 Smiley SketchCow: omf_ also has done some grabs but I don't know if he's uploaded them
15:15 🔗 DFJustin so what was the deal with formspring not really shutting down
15:16 🔗 Smiley FAME \o/
15:16 🔗 Smiley DFJustin: someone bought them
15:16 🔗 SketchCow Yeah
15:16 🔗 Smiley or threw money at them or something.
15:17 🔗 SketchCow But they did it really stupid under the radar.
15:17 🔗 Smiley But we all know how money runs out.... <except at yahoo, where they seem to shit it out o_O>
15:17 🔗 SketchCow Showing, of course, they were inclined to be non transparent the whole way.
15:17 🔗 GLaDOS Yahoo shits out money, then loves watching the reaction of people when they shut something down.
15:18 🔗 GLaDOS They MUST be doing it for the reaction at this point.
15:18 🔗 godane1 SketchCow: this one shouldn't be in that collection: http://archive.org/details/newamerica.net
15:18 🔗 Smiley godane1: good catch, was about to go looking for that
15:18 🔗 Smiley Sadly it's incomplete, but it OOM'ed a 12Gb box
15:19 🔗 godane1 thats ok
15:19 🔗 godane1 this was just a panic run anyways
15:19 🔗 godane1 in case they do shut down
15:23 🔗 SketchCow Yeah, they're going to end up being 80tb
15:24 🔗 godane1 wow
15:24 🔗 godane1 i figure maybe 1TB at the most
15:26 🔗 SketchCow I mean Formspring.
15:26 🔗 godane1 oh
15:30 🔗 SketchCow Uploading hundreds of issues of Manga, millions of pages.
15:30 🔗 godane1 you mean the stuff anime is based on?
15:31 🔗 SketchCow Sometimes it's based on that.
15:31 🔗 SketchCow I'm doing it by hand with lots of script help - adding a new manga issue every 3 seconds.
15:32 🔗 SketchCow http://archive.org/details/manga_library
15:32 🔗 Smiley oh that reminds me
15:33 🔗 Smiley that guy who has a pile of manga and other "adult" japanese scans hasn't come back to me
15:34 🔗 SketchCow I'm not toooo worried these will disappear
15:35 🔗 DFJustin hentai encyclopedia would be a good thing to dump onto ia at some point
15:39 🔗 DFJustin also a lot of these would benefit from page-progression=rl
15:40 🔗 antomatic Breaking: Greek public broadcaster ERT is being closed down tonight - website at www.ert.gr (in greek, obviously)
15:41 🔗 GLaDOS YEAH ALRIGHT LETS DO THIS
15:42 🔗 SketchCow Yeah, get on that shit.
15:42 🔗 SketchCow WARC 'er up
15:44 🔗 GLaDOS LEEEEEEEEEEROOOOOOOY
15:44 🔗 GLaDOS JEEEEEENKIIIIIIIIIIINS
15:46 🔗 GLaDOS Anarchive is happily slurping away at it on a root tmux session.
15:46 🔗 Tephra_ what's the best way to archive a site like that? just wget --mirror ?
15:47 🔗 GLaDOS There's a few generic commands at www.archiveteam.org/index.php?title=User:Djsmiley2k#Generic_Wget_command (need to get to moving it over to an article)
15:49 🔗 Tephra_ nice thanks, I have 100 bits/s down and an unused laptop that i can bring on the site
15:50 🔗 Smiley 100 bit's down o_O
15:50 🔗 Tephra_ sweden! ftw
15:50 🔗 GLaDOS mbits*?
15:50 🔗 SketchCow Walk Tephra_ through the WARC wget
15:51 🔗 Tephra_ well we pay for 100 down but usually get 60-70
15:54 🔗 Tephra_ should i just use the generic command on Djsmiley's page?
15:54 🔗 GLaDOS That's all that I'm doing.
15:55 🔗 GLaDOS (I suck at using wget)
15:55 🔗 SketchCow Yes, that's a good one for a panic download, Tephra_
15:56 🔗 Tephra_ SketchCow GLaDOS I'm starting it now, will report when its sucked
15:56 🔗 Tephra_ everything
15:57 🔗 SketchCow Thanks
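
A rough sketch of the kind of "panic download" discussed above, for readers without the wiki page at hand. This is not the command from Djsmiley2k's page. It only assembles a plausible GNU wget invocation (WARC options need wget 1.14 or later) as an argument list. The function name, the WARC base name, and the one-second wait are assumptions for illustration.

```python
import shlex


def build_wget_warc_command(url, warc_name):
    """Return an argv list for a recursive wget crawl that also writes a WARC.

    Hypothetical helper: the flag selection is an illustrative assumption,
    not the command used in the channel.
    """
    return [
        "wget",
        "--recursive",              # follow links within the site
        "--level=inf",              # no depth limit (what --mirror implies)
        "--page-requisites",        # also fetch images, CSS and scripts
        "--warc-file=" + warc_name, # write <warc_name>.warc.gz alongside the files
        "--warc-cdx",               # also write a CDX index for the WARC
        "--wait=1",                 # assumed politeness delay, in seconds
        url,
    ]


if __name__ == "__main__":
    cmd = build_wget_warc_command("http://www.ert.gr/", "ert.gr_panic")
    # Print a copy-pasteable shell command instead of running it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command rather than executing it keeps the sketch safe to run; pass the list to `subprocess.run` only once the flags have been checked against the local wget version.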
16:17 🔗 Nemo_bis http://www.archiveteam.org/index.php?title=Main_Page has a wrong link http://www.archiveteam.org/index.php?tile=Formspring
16:17 🔗 GLaDOS ..and we/someone took ert.gr down.
16:18 🔗 Nemo_bis needs to be corrected to http://www.archiveteam.org/index.php?title=Formspring or better [[Formspring|etc.]]
16:18 🔗 GLaDOS My bad.
16:18 🔗 Nemo_bis GLaDOS: Tephra was punished by being kicked off IRC for 3 min, it seems
16:19 🔗 GLaDOS Heh
16:19 🔗 antomatic seems OK here
16:19 🔗 GLaDOS Yeah, came back up.
16:19 🔗 antomatic phew
16:21 🔗 antomatic Apparently the staff are planning to keep on broadcasting and are guarding the headquarters.
16:21 🔗 antomatic lots happening.
16:42 🔗 Tephra SketchCow: getting lots of "can't write to ert.gr/xx/xxx/yy/yyy/blah-blah-blah (not a directory)"
16:43 🔗 Tephra seems like wget didn't create a correct directory structure or something
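
The log does not record what caused this error or how it was fixed. One common cause of a "not a directory" failure in a mirror is that a URL such as `/news` is saved as a plain file, and a later URL such as `/news/today.html` then needs `news` to be a directory. The sketch below assumes that cause and checks a URL list for such clashes; the function name and the example.com URLs are hypothetical.

```python
from urllib.parse import urlsplit


def find_file_directory_collisions(urls):
    """Return paths that would need to be both a file and a directory on disk."""
    file_paths = set()
    dir_paths = set()
    for url in urls:
        path = urlsplit(url).path
        parts = [p for p in path.split("/") if p]
        if not parts:
            continue
        # Every proper prefix of the path has to exist as a directory.
        for i in range(1, len(parts)):
            dir_paths.add("/".join(parts[:i]))
        full = "/".join(parts)
        if path.endswith("/"):
            # A trailing slash is stored as <dir>/index.html, so it is a directory.
            dir_paths.add(full)
        else:
            file_paths.add(full)
    return sorted(file_paths & dir_paths)


if __name__ == "__main__":
    sample = [
        "http://example.com/news",
        "http://example.com/news/today.html",
    ]
    print(find_file_directory_collisions(sample))
```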
16:44 🔗 Cowering now maybe jason can do something about this :) http://www.foxnews.com/tech/2013/06/11/scientists-searching-for-world-first-web-page-turn-to-north-carolina/
18:19 🔗 SketchCow Internet Archive's having some connectivity/s3 issues.
18:55 🔗 SketchCow https://twitter.com/textfiles/status/344528364129882114
19:04 🔗 Smiley lol
19:15 🔗 omf_ We won the 2013 National Digital Stewardship Alliance award for an Organization. https://twitter.com/the_idea_agency/status/344532491446657025 http://blogs.loc.gov/digitalpreservation/2013/06/and-the-winner-is-announcing-the-2013-ndsa-innovation-award-winners/
19:16 🔗 omf_ See there are awards for groups of loud assholes ;)
19:22 🔗 omf_ I for one am going to retweet that, maybe even leak a nice comment on the blog. Lets publicize our VICTORY!!1!
19:22 🔗 omf_ s/leak/leave/ # too much NSA
19:23 🔗 Cowering Jason, can you ask the NSA to borrow a copy of the Internet ?
19:25 🔗 omf_ I bet the NSA has the most pristine and complete copy of 4chan ever
21:39 🔗 SilSte Hi @ all
21:39 🔗 SilSte I'm having a little problem... today the warrior "died". It forgot the old state and restarted with new jobs. The VM did not reboot. The files were still on the disk...
21:40 🔗 SilSte it happened at least twice... more than 50gb of downloaded work gone...
21:40 🔗 SilSte is there any possibility to resume old threads?
21:40 🔗 SketchCow Which site are you downloading?
21:40 🔗 SilSte formspring
21:40 🔗 SketchCow Well, I'm sorry to hear it's lost.
21:40 🔗 SilSte but there were also a few posterous jobs
21:41 🔗 SketchCow Well, the answer is to go to #warrior and ask people if they want to help and to be patient.
21:41 🔗 SketchCow Neither Posterous nor Formspring is too time critical.
21:41 🔗 SilSte there are no other jobs in the warrior atm ;-)
21:41 🔗 SilSte or are there things which should come first?
21:43 🔗 SketchCow No, you should go back to Formspring, but people in #warrior should be able to trace back what MIGHT have been your issue.
21:47 🔗 Smiley 400 claims?
21:48 🔗 SilSte me?
21:48 🔗 SilSte no about 65 are at work now...
21:49 🔗 SilSte perhaps 70
21:49 🔗 Smiley Come to other channel -> #warrior
22:26 🔗 Tephra SketchCow: update for ert.gr, they seem to have shut down the tv broadcast but the site is still up and kicking for now. I'll keep wget up and running through the night; hopefully the site will be up long enough to grab a good portion of it, and hopefully I have solved the problem with wget not getting the directories right
22:29 🔗 SketchCow Thanks.
22:37 🔗 godane1 looks like i can upload again
22:37 🔗 godane1 :-D
22:53 🔗 SilSte Tephra: is it possible to help with ert.gr?
