#archiveteam 2011-12-20,Tue

Time Nickname Message
00:01 🔗 Nemo_bis They're working on mirroring text dumps. They're already something like 10 TB https://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps
00:01 🔗 Nemo_bis A mirror has been added recently, an rsync to a second server is happening right now, a new cluster has been ordered yesterday.
00:01 🔗 Nemo_bis So, things are moving a bit.
00:15 🔗 soultcer qls
00:15 🔗 soultcer Whoops, wrong tab
02:51 🔗 bsmith094 any ideas for how to scrape fanfiction.net
02:53 🔗 PatC bsmith094, didn't you get that already?
02:53 🔗 bsmith094 got the story id numbers, not the stories
05:56 🔗 yipdw heh, wow, www.naenara.com.kp uses client-side imagemaps
05:56 🔗 yipdw I haven't seen those in a long time
13:04 🔗 SketchCow The article is up
13:04 🔗 SketchCow (Tech Review)
13:04 🔗 SketchCow You can read it and decide if my concerns were accurate.
13:10 🔗 SketchCow http://www.technologyreview.com/article/39317/
13:12 🔗 PatC nice comic :)
13:34 🔗 ersi Hoh! Awesome
14:14 🔗 ersi haha, awesome quotes
14:23 🔗 Frigolit :]
14:25 🔗 ersi I did somehow expect Schwartz to totally go nuts on AT though
14:26 🔗 ersi dunno why, might have something to do with his personal site
15:34 🔗 Schbirid the images rock
15:49 🔗 chronomex shmmmm.
15:59 🔗 chronomex good morning fellas
17:17 🔗 yipdw huh
17:17 🔗 yipdw I wonder how they figured out some of the handles in this channel
17:17 🔗 yipdw most likely by looking at archiveteam.org and inferring
17:17 🔗 yipdw OR PERHAPS SOMEONE IN HERE IS A MOLE
17:18 🔗 yipdw also, I've got a WARC of naenara.com.kp; what's the easiest way to get that to IA? register and upload?
17:19 🔗 chronomex yep
17:19 🔗 chronomex is it in .warc?
17:20 🔗 yipdw yes
17:20 🔗 yipdw it's only 1.6 GB, gzipped
17:20 🔗 yipdw I've got a significant part of the North Korean internet on a USB pen drive
17:20 🔗 yipdw that's an awesome thought
17:21 🔗 chronomex rad.
17:23 🔗 chronomex I'd say upload it, then let info@archive.org know.
17:24 🔗 yipdw using http://www.archive.org/create/, I guess?
17:24 🔗 yipdw or is there a specialized upload point for WARCs?
17:24 🔗 yipdw that link was just the first I found
17:27 🔗 chronomex yes, that
17:30 🔗 yipdw "You appear to be using the Firefox browser.
17:30 🔗 yipdw The browser will only upload files of 2GB or less."
17:30 🔗 yipdw that's right
17:30 🔗 yipdw good thing that fits within those limits
17:32 🔗 yipdw heh
17:33 🔗 yipdw I understand how a browser can parse that, but I'm simultaneously amazed that it works
17:34 🔗 winr4r yipdw: that is beautiful
17:35 🔗 yipdw browsers must have some of the best backwards compatibility ever
17:57 🔗 yipdw alright, uploaded and notified
18:14 🔗 Coderjoe mm quirks mode
18:15 🔗 Coderjoe you can also create items through the s3 interface. each item is a bucket and the first file uploaded to the bucket creates it.
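A minimal sketch of the S3-style upload Coderjoe describes, assuming the Internet Archive's S3-like endpoint at s3.us.archive.org, its "LOW accesskey:secret" authorization scheme, and the x-amz-auto-make-bucket header that asks the first PUT to create the item. None of these details appear in the chat, and the item name, filename, and keys below are hypothetical placeholders. The request is only built, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumed endpoint for the S3-like interface; not stated in the chat.
S3_ENDPOINT = "https://s3.us.archive.org"


def build_upload_request(item, filename, payload, access_key, secret_key):
    """Build (but do not send) a PUT that uploads one file into an item.

    The item plays the role of the bucket. The auto-make-bucket header
    asks the service to create the item if it does not exist yet, which
    matches "the first file uploaded to the bucket creates it".
    """
    url = f"{S3_ENDPOINT}/{quote(item)}/{quote(filename)}"
    headers = {
        # Assumed authorization scheme: "LOW <access>:<secret>".
        "Authorization": f"LOW {access_key}:{secret_key}",
        "x-amz-auto-make-bucket": "1",
    }
    return Request(url, data=payload, headers=headers, method="PUT")


# Hypothetical item name, filename, and credentials, for illustration only.
req = build_upload_request(
    "example-warc-item", "site.warc.gz", b"placeholder bytes", "ACCESS", "SECRET"
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would perform the actual upload; for a multi-gigabyte WARC a streaming client would be a better fit than holding the payload in memory.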
20:05 🔗 soultcer I like the TR article
20:18 🔗 bsmith094 which article?
20:23 🔗 Nemo_bis bsmith094, http://www.technologyreview.com/article/39317/
21:18 🔗 SketchCow Tjamls sp ,icj. yipdw
21:18 🔗 SketchCow Tjat
21:18 🔗 SketchCow Thanks so much yipdw
21:18 🔗 SketchCow That's the golden stuff.
21:20 🔗 yipdw np
21:20 🔗 yipdw I'll see what else I can get before they officially replace Dear Leader
21:32 🔗 SketchCow Excellent.
21:32 🔗 SketchCow Actually, I think the grandson is already dear leader.
21:35 🔗 yipdw oh, good point
21:35 🔗 yipdw ha, naenara updaetd
21:35 🔗 yipdw updated
21:35 🔗 yipdw might as well run the mirror again
21:38 🔗 SketchCow Obviously get the sets.
21:38 🔗 SketchCow Want a place to FTP?
21:40 🔗 yipdw I'm uploading to archive.org right now via their HTTP interface
21:40 🔗 yipdw FTP would probably be nicer, though
21:40 🔗 yipdw though I guess I can also use the S3-alike
21:43 🔗 SketchCow Get a free account
21:43 🔗 SketchCow and you can upload via FTP
21:43 🔗 SketchCow I can help with that.
21:46 🔗 yipdw SketchCow: got the account
21:46 🔗 yipdw er, I mean, I have an account
21:46 🔗 yipdw brb
22:52 🔗 chronomex hm. I still hate magtape.
22:58 🔗 chronomex those fuckers in the 70s
22:59 🔗 chronomex the oxide is glued to the tape with a urethane compound, which gets gummy over the course of about 10 years
22:59 🔗 Coderjoe whee
22:59 🔗 chronomex you can fix it by baking the thing at 135-140F
22:59 🔗 Coderjoe and warp the base
23:00 🔗 chronomex less than ~135 won't do anything, more than 140 will cause printthrough
23:00 🔗 SketchCow I am helping negotiate the possible transfer of something like 135,000 tapes
23:00 🔗 chronomex the base of my carts is a 2mm aluminum slab
23:00 🔗 SketchCow Isn't that exciting.
23:00 🔗 chronomex ooh, what of?
23:00 🔗 chronomex and what type of tapes?
23:00 🔗 RedType http://www.sfgate.com/cgi-bin/article.cgi?f=/n/a/2011/12/20/national/a065958S85.DTL
23:00 🔗 SketchCow It might be 35,000, someone might have typod.
23:01 🔗 SketchCow Reel to reel of some sort
23:01 🔗 RedType They talked about Base64, a program that compresses digital documents for speedy transmission by removing all the spaces and punctuation marks.
23:01 🔗 RedType :|
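For context on RedType's reaction: Base64 is an encoding, not a compressor. It maps every 3 input bytes to 4 ASCII characters, so the output is roughly a third larger than the input, and spaces and punctuation survive a round trip unchanged. A small sketch with an arbitrary sample string:

```python
import base64

# Arbitrary sample text containing spaces and punctuation.
sample = b"Spaces, commas, and full stops are all kept. Nothing is removed!"

encoded = base64.b64encode(sample)
decoded = base64.b64decode(encoded)

# Each group of 3 input bytes becomes 4 output characters (with padding
# on the last group), so the encoded form is larger, never smaller.
expected_length = 4 * ((len(sample) + 2) // 3)

print("input bytes:  ", len(sample))
print("encoded bytes:", len(encoded))
print("round trip ok:", decoded == sample)
```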
23:01 🔗 SketchCow Everything the Christian Science Monitor recorded for radio, ever
23:01 🔗 chronomex ah. i wish these were reel tapes, that would solve some problems :|
23:01 🔗 chronomex wow
23:01 🔗 Coderjoe mmm... so one side of the tape gets extra crispy while the other stays original recipe
23:01 🔗 chronomex Coderjoe: ?
23:01 🔗 chronomex oh
23:01 🔗 SketchCow They're being digitized, we're just discussing having the original tapes.
23:02 🔗 chronomex cool
23:02 🔗 chronomex I hate tape *cartridges*.
23:02 🔗 Coderjoe chronomex: the 2mm aluminum slab. it will cook one side of the tape more than the other
23:03 🔗 chronomex almost to the point where i'm going to pay someone to do this for me
23:03 🔗 chronomex Coderjoe: Ah. I see. No.
23:03 🔗 chronomex tape is at right angle to the base
23:04 🔗 Coderjoe duh
23:04 🔗 chronomex well maybe trks 0 and 1 vs 2 and 3
23:04 🔗 Coderjoe side being EDGE not flat
23:05 🔗 chronomex who ever talks about datatape that way :P
23:06 🔗 SketchCow Anyone feel like typing in Compute! tables of contents?
23:08 🔗 bsmith095 archiveteam returns "this site may be compromised" on google
23:09 🔗 SketchCow Gotta fix that.
23:13 🔗 bsmith095 anything reasonably sized that needs downloading? not mobileme, too big, i couldn't put a start of a dent in that
23:15 🔗 yipdw goddamnit, I sent ^C to the wrong terminal
23:15 🔗 yipdw I hate when I kill a process that's been running for an hour or so
23:15 🔗 yipdw at least it's idempotent
23:21 🔗 chronomex worse still when it's 80% done with a two-week job.
23:22 🔗 yipdw I haven't done that before
23:22 🔗 chronomex it sucks hard
23:23 🔗 Coderjoe even worse is when things get OOM-killed
23:23 🔗 SketchCow I'm still, STILL adding Jamendo.
23:33 🔗 chronomex yeah. oom is the pits.
23:45 🔗 bsmith095 did thingiverse ever finish up, or is that still open?
23:48 🔗 SketchCow We did a full round
23:48 🔗 SketchCow We'll do another round at some point.
