#archiveteam 2012-10-03,Wed


Time Nickname Message
02:51 ๐Ÿ”— SketchCow The bitsavers mirror I am doing is down to the w's in pdf.
02:51 ๐Ÿ”— SketchCow Only took it 5 days.
06:51 ๐Ÿ”— SketchCow alard: Can I go ahead and archive out Cinch and put City of Heroes on its own directory?
07:17 ๐Ÿ”— SketchCow Fun Trivia: Currently, Fortress of Solitude has 7.4tb of space free.
07:18 ๐Ÿ”— chronomex :D
07:18 ๐Ÿ”— chronomex git annex get */*
07:24 ๐Ÿ”— SketchCow http://archive.org/details/wikileaksarchive_20100522
07:24 ๐Ÿ”— SketchCow bwa ha ha, well, that can't go south
07:25 ๐Ÿ”— chronomex :D
07:25 ๐Ÿ”— SmileyG I think I finally found a project no one else does
07:26 ๐Ÿ”— SmileyG Redfish Magazine
07:28 ๐Ÿ”— SmileyG damn slow site due to it being in US tho.
07:29 ๐Ÿ”— SmileyG http://redfishmagazine.com.au/
07:34 ๐Ÿ”— SmileyG err in Au even ;D
07:37 ๐Ÿ”— SketchCow I could do that magazine in under 20 minutes, I think.
07:38 ๐Ÿ”— SketchCow Grab them, we can put them on archive.org.
07:39 ๐Ÿ”— SmileyG i'm doing em now :P
07:39 ๐Ÿ”— SmileyG Its something I can manage \o/
07:39 ๐Ÿ”— SmileyG It's asking me which collection to put it in.... they are PDF text so... text, but.... Community texts?
07:39 ๐Ÿ”— alard SketchCow: Yes, you can remove City of Heroes from the Cinch directory. It's not being used at the moment. However, we might want to do another run of the CoH boards before it closes, so it would be useful to have a different rsync upload space if you close Cinch. (Perhaps a reusable "warrior" rsync thing?)
07:40 ๐Ÿ”— SmileyG (yes I suck and have never done this before :( )
07:41 ๐Ÿ”— chronomex everyone learns sometime
07:41 ๐Ÿ”— chronomex well, some people never learn
07:41 ๐Ÿ”— SmileyG :D
07:42 ๐Ÿ”— SmileyG such as which licence do I choose etc :<
07:47 ๐Ÿ”— SketchCow Do your best, Smiley. I'll fix everything soon.
07:47 ๐Ÿ”— SketchCow When do CoH close?
07:48 ๐Ÿ”— SmileyG SketchCow: ok thanks :D
07:49 ๐Ÿ”— SmileyG I could write a bash script to get the different versions, but its almost as quick for me to simply type in the browser :D
07:50 ๐Ÿ”— SmileyG Got all the EU ones to date so far, just waiting for the collection to update to ensure I'm doing this right ;D
08:06 ๐Ÿ”— alard "The City of Heroes® servers will shut off on November 30, 2012" (although it doesn't mention the boards specifically)
08:17 ๐Ÿ”— SmileyG Can you turn on the CoH warrior job?
08:20 ๐Ÿ”— alard Is it time yet? (We've already got a full copy, just not the recent posts.)
08:21 ๐Ÿ”— SketchCow http://bitsavers.trailing-edge.com/www.computer.museum.uq.edu.au/
08:22 ๐Ÿ”— SmileyG alard: I dunno, it was a question of if it's "that simple".
08:23 ๐Ÿ”— alard Yes, it's not hard. I have to clear the current 'done' list, then queue the ids again. (And I've just said goodbye to the upload space on fos, so that must be fixed first.)
08:23 ๐Ÿ”— SketchCow http://bitsavers.trailing-edge.com/www.computer.museum.uq.edu.au/newsletters/
08:24 ๐Ÿ”— SketchCow Why look, something I'm going to shove into archive.org
08:24 ๐Ÿ”— chronomex WHY LOOK
08:25 ๐Ÿ”— SmileyG server readonly -- tasks waiting for harddrive fix ???
08:25 ๐Ÿ”— SmileyG Also, what is the "submit time" ?
08:25 ๐Ÿ”— chronomex time the task was kicked off
08:26 ๐Ÿ”— SmileyG TheSienaNewsJune161948(2.1 years)xxx@xxxxx.xxx-1waiting
08:26 ๐Ÿ”— SmileyG lulz? 2.1 years ago?
08:26 ๐Ÿ”— chronomex sounds old
08:27 ๐Ÿ”— SmileyG theres another for 2.4 years
08:38 ๐Ÿ”— chronomex hm, I have a request for help finding data in focity
08:38 ๐Ÿ”— chronomex anyone remember how we were doing that?
08:39 ๐Ÿ”— SmileyG focity, fortune city?
08:39 ๐Ÿ”— SmileyG I'd only just joined when that was going on. No idea sry :<
08:40 ๐Ÿ”— chronomex right, fortunecity
08:41 ๐Ÿ”— alard http://archive.org/download/test-memac-index-test/fortunecity.html
08:42 ๐Ÿ”— chronomex thanks
08:42 ๐Ÿ”— alard That's only for the new username-style sites, not for the ancient area/street/number structure.
08:43 ๐Ÿ”— chronomex hmmm, she's actually looking for anissa.myblogsite.com
08:43 ๐Ÿ”— chronomex myblogsite.com was a fortunecity service, did we suck that in?
08:44 ๐Ÿ”— alard No, I don't think so. This is the first time I hear of that.
08:44 ๐Ÿ”— chronomex damn.
08:44 ๐Ÿ”— chronomex ok
08:44 ๐Ÿ”— alard We only have things in *.fortunecity.*
08:45 ๐Ÿ”— alard (They apparently also had MyPhotoAlbum.com.)
08:46 ๐Ÿ”— chronomex nice name too
08:46 ๐Ÿ”— chronomex ...
09:09 ๐Ÿ”— Nemo_bis ah, wonderful! TASK FAILED AT UTC: 2012-10-02 18:11:37 http://www.us.archive.org/log_show.php?task_id=124346847
09:09 ๐Ÿ”— Nemo_bis nice /usr/local/petabox/deriver/derive.php /var/tmp/autoclean/derive/EB1911WMF 'task_id=124346847&identifier=EB1911WMF&server=iw600306.us.archive.org&cmd=derive.php&args=dir%3D%252F12%252Fitems%252FEB1911WMF%26prevtask%3D124346467%26server_primary%3Dia601202.us.archive.org&submittime=2012-09-22+14%3A27%3A06&submitter=federicoleva%40tiscali.it&priority=-6&wait_admin=0&finished=0' failed with exit code: 9
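Note on the task string above: its args field appears to be URL-encoded twice (%252F is an encoded %2F, which is itself an encoded /). A minimal Python sketch of unpacking it, assuming only the standard library's urllib.parse. The string is a shortened copy of fields quoted in the log line, and the variable names are illustrative only.

```python
from urllib.parse import parse_qs

# Shortened copy of the task string from the log line above
# (fields copied verbatim; the remaining fields are omitted for brevity).
task = (
    "task_id=124346847&identifier=EB1911WMF"
    "&args=dir%3D%252F12%252Fitems%252FEB1911WMF%26prevtask%3D124346467"
    "%26server_primary%3Dia601202.us.archive.org"
    "&submittime=2012-09-22+14%3A27%3A06&priority=-6"
)

# First pass: decode the outer query string (one value per key here).
outer = {key: values[0] for key, values in parse_qs(task).items()}

# Second pass: the decoded args value is itself a query string.
inner = {key: values[0] for key, values in parse_qs(outer["args"]).items()}

print(outer["submittime"])  # when the task was submitted
print(inner["dir"])         # item directory on the storage node
print(inner["prevtask"])    # id of the preceding task
```

Read this way, the failing derive task was submitted on 2012-09-22 for the item EB1911WMF and points back to an earlier task id.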
11:11 ๐Ÿ”— SmileyG errr,
11:11 ๐Ÿ”— SmileyG when I select Public Domain Mark it kills firefox :/
11:16 ๐Ÿ”— SmileyG added a film that Jason references in one of his talks which is apparently public domain but not on the Archive - Jim Henson's "The Cube".
11:16 ๐Ÿ”— SmileyG except it doesn't show under my uploads :<
11:28 ๐Ÿ”— SmileyG http://archive.org/details/TheCube-JimHenson-1969 worked the second time :S
14:16 ๐Ÿ”— DFJustin SketchCow: add that to the http://archive.org/details/wikileaksarchive collection?
14:23 ๐Ÿ”— SmileyG Aurgh now the cube appears twice! Failure!
14:41 ๐Ÿ”— SmileyG https://archive.org/details/groks209 its silent o_O
17:45 ๐Ÿ”— SketchCow DO NOT take this the wrong way as I am absolutely sure your heart.s in the right place,
17:45 ๐Ÿ”— SketchCow Look, Jase ..
17:45 ๐Ÿ”— SketchCow but judt why, exactly are you doing this?
17:45 ๐Ÿ”— SketchCow I mean, I can see the computer magazines & everything being done . the .Computer Gaming World. PDF archive is a gift from God itself! . but I.ll wager 99.9% of the books you.re Scribe-ing will never ever be looked at.
17:45 ๐Ÿ”— SketchCow And all of the important works I.m sure.s been covered already by the Gutenberg Project.
17:46 ๐Ÿ”— SketchCow Is .Just Knowing It.s There & Available. really a good enough reason to be doing all of this incredibly difficult work ..?
17:46 ๐Ÿ”— SketchCow ....
17:46 ๐Ÿ”— SketchCow This is the best comment, ever.
17:53 ๐Ÿ”— chronomex punctuation motherfucker
17:53 ๐Ÿ”— chronomex ~
17:54 ๐Ÿ”— Ymgve what kind of keyboard does that guy have
17:54 ๐Ÿ”— chronomex | tr "'," ".."
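Note on the tr pipeline above: tr maps each character of its first argument to the character at the same position in the second, so this one turns both apostrophes and commas into periods. A rough Python equivalent, as a sketch; the sample sentence is hypothetical and not taken from the quoted comment.

```python
# Build a translation table equivalent to: tr "'," ".."
# (apostrophe -> period, comma -> period)
table = str.maketrans("',", "..")

# Hypothetical sample text, for illustration only.
sample = "I'll wager, you're right"
mangled = sample.translate(table)

print(mangled)  # I.ll wager. you.re right
```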
18:24 ๐Ÿ”— SketchCow alard: Are you around?
18:25 ๐Ÿ”— SketchCow Hey, so good news, everyone. Archive.org is now generating the files and beginning the process of a new wayback machine index.
18:26 ๐Ÿ”— SketchCow I'm tasked with helping us prepare the archiveteam uploads of the last year for inclusion into the Wayback.
18:26 ๐Ÿ”— SketchCow So we're going to need an inventory of the sites we've grabbed, which of our stuff is in WARC format.
18:43 ๐Ÿ”— alard SketchCow: Yes, somewhat.
18:43 ๐Ÿ”— alard That's good news.
18:48 ๐Ÿ”— alard Does this include the warc-in-tar stuff?
19:28 ๐Ÿ”— SketchCow Yes, I'm about to set up an inventory of all our projects, so we can pass it for testing
19:28 ๐Ÿ”— SketchCow Most will be fine, SOME might need to be rejiggered in some way.
19:46 ๐Ÿ”— DFJustin SketchCow: FUCK YEAH
19:46 ๐Ÿ”— chronomex :D
19:48 ๐Ÿ”— Nemo_bis Without the MobileMe stuff, right?
20:29 ๐Ÿ”— lemonkey this link is now sorta old but anyway... http://h30565.www3.hp.com/t5/Feature-Articles/The-History-of-the-Floppy-Disk/ba-p/6434
21:35 ๐Ÿ”— SketchCow The mobileme stuff is a separate project for now.
21:35 ๐Ÿ”— SketchCow MAYBE it goes in.
21:36 ๐Ÿ”— SketchCow Archive.org is basically DOUBLING the amount of data coming in from the last crawl (which people have figured out, was early last year.)
21:36 ๐Ÿ”— SketchCow This is going to just be a massive ingestion of data, but then our stuff joins in.
22:13 ๐Ÿ”— SketchCow https://docs.google.com/spreadsheet/ccc?key=0ApQeH7pQrcBWdDZIUEVjR3d1UmRoU0lPSWZYX0Q1Ync
22:23 ๐Ÿ”— SketchCow So, that's me collecting all our items and collections that have .WARC files involved in them.
22:24 ๐Ÿ”— SketchCow So they'll go into the archive.org wayback this month!
22:24 ๐Ÿ”— SketchCow And the Wayback will jump up to six months ago.
22:24 ๐Ÿ”— chronomex yaaaay
22:24 ๐Ÿ”— chronomex is it some kind of horrendous semimanual batch update?
22:24 ๐Ÿ”— SketchCow yes.
22:24 ๐Ÿ”— SketchCow Oh, as horrible as you can imagine.
22:24 ๐Ÿ”— SketchCow The number thrown to me is that the DB has 168 billion rows.
22:25 ๐Ÿ”— SketchCow Wait, no.
22:25 ๐Ÿ”— SketchCow 168 million, sorry.
22:25 ๐Ÿ”— SketchCow Anyway, it jumps to 240 million after this single update.
22:25 ๐Ÿ”— chronomex that's a decent database
22:25 ๐Ÿ”— SketchCow So it's... significant
22:25 ๐Ÿ”— chronomex crap
22:25 ๐Ÿ”— SketchCow It may double the wayback
22:25 ๐Ÿ”— chronomex but the data is mostly from just recently?
22:27 ๐Ÿ”— DFJustin is godane's stuff going in
22:27 ๐Ÿ”— DFJustin I guess it's all too new
22:28 ๐Ÿ”— joepie91 SketchCow: don't forget devilskitchen ;)
22:31 ๐Ÿ”— SketchCow Already in there. Look again.
22:31 ๐Ÿ”— SketchCow I haven't browsed godane's stuff yet.
22:31 ๐Ÿ”— DFJustin reminder there's still 3 items kicking around in http://archive.org/details/archiveteam-mobileme
22:35 ๐Ÿ”— SketchCow Fixed.
22:36 ๐Ÿ”— SketchCow This is definitely where I need help - to find orphan items so we can clean it up, and then shove the right things into the crawl.
22:39 ๐Ÿ”— underscor <insert obligatory shoving things places joke>
22:40 ๐Ÿ”— DFJustin not fixed for http://archive.org/details/archiveteam-mobileme-hero-2511x
22:48 ๐Ÿ”— DFJustin friendster also absent
22:49 ๐Ÿ”— SketchCow Friendster isn't warc, as far as I can tell.
22:50 ๐Ÿ”— SketchCow Am I wrong?
22:50 ๐Ÿ”— DFJustin no idea
23:14 ๐Ÿ”— SketchCow Investigating.
23:32 ๐Ÿ”— SketchCow Just checked, Friendster is the last of the No-WARC saves.
23:32 ๐Ÿ”— SketchCow It was soon after that the request for WARC came.
23:46 ๐Ÿ”— closure also Friendster had the really crazy phantomjs ripper, since we needed javascript or something..
23:48 ๐Ÿ”— Coderjoe "why are you doing this?" uh, because otherwise it doesn't get done and winds up lost to the dust of time.
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
23:54 ๐Ÿ”— lolumad WHY DON'T I STRETCH OUT? AHAHAHA!
