#archiveteam 2011-09-24,Sat

Time Nickname Message
01:27 dashcloud the new jstor liberator is really nice- and so is the website
01:34 Ymgve what's the jstor liberator
01:44 db48x2 it's a project that alard cooked up for downloading articles from JSTOR and liberating them
01:44 db48x2 don't have the url off-hand
01:50 Coderjoe [Sep 23 11 04:43] <alard> If you've lost the bookmarklet: http://severe-samurai-6114.heroku.com/
01:56 Ymgve so what does it do exactly? pick random documents from the old collection they've opened up?
01:58 db48x2 not random, exactly
01:58 db48x2 to the user it looks random
02:01 db48x2 the server is handing out documents that still need to be downloaded
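
The hand-out behaviour db48x2 describes can be pictured roughly as below. This is only a sketch under assumptions: the `WorkQueue` class, its method names, and the random pick among pending items are hypothetical and are not taken from alard's actual server.

```python
import random


class WorkQueue:
    """Hypothetical tracker that hands out items still needing download."""

    def __init__(self, item_ids):
        self.pending = set(item_ids)  # not yet downloaded
        self.done = set()             # reported as downloaded

    def next_item(self):
        """Return one pending id, or None once nothing is left."""
        if not self.pending:
            return None
        # Choosing among pending items only: it looks random to the user,
        # but finished documents are never handed out again.
        return random.choice(sorted(self.pending))

    def mark_done(self, item_id):
        """Record that a client finished this item."""
        self.pending.discard(item_id)
        self.done.add(item_id)
```

A client would call `next_item()`, fetch that document, then report back with `mark_done()`.
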
02:05 chronomex looks like you need text, alard. shall I compose some for you?
02:20 SketchCow OK, so.
02:20 SketchCow I just did an experiment, I had a file misnamed in the friendster collection.
02:20 SketchCow FRIENDSTER-000000000 has a 40gb file I was calling 000-999
02:20 SketchCow No.
02:20 SketchCow It's 000000-340000
02:21 SketchCow So there's all those early files.
02:23 SketchCow I would claim that file is the most critical of all of them.
02:23 chronomex makes sense
02:30 SketchCow New laptop came in.
02:31 SketchCow Soon it'll be digitizing VHS tapes.
02:33 dashcloud alard: you should either flag or reject any PDFs that are 2kb in size (they can never be valid)- that's what you get if you just click liberate without having recently viewed a PDF
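
A check along the lines dashcloud suggests might look like the sketch below. The 2 KB threshold comes from his message; the function name, the exact cut-off constant, and the extra `%PDF-` header test are assumptions added for illustration, not part of the real upload handler.

```python
# Assumption: anything at or under about 2 KB is treated as suspect.
MIN_PDF_BYTES = 2 * 1024


def looks_like_pdf(data: bytes, min_size: int = MIN_PDF_BYTES) -> bool:
    """Return False for uploads that are too small or lack a PDF header."""
    if len(data) <= min_size:
        return False
    # PDF files begin with a "%PDF-" version marker.
    return data.startswith(b"%PDF-")
```

Uploads that fail the check could be flagged for review rather than rejected outright.
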
02:54 chronomex SketchCow: do you know if the archive.org deriver handles .tar.gz of images properly, or only .tar?
02:54 chronomex the documentation only says .tar, but my images are uncompressed so .gz is a good deal better
02:55 chronomex or would it be better to look into compressing my originals
03:17 SketchCow Don't compress.
03:17 SketchCow Try _images.tar.gz
03:17 SketchCow or images_tar
03:17 SketchCow I mean _images.tar
03:59 chronomex okay, i suppose i will try that
03:59 chronomex i just want to avoid a huge put that doesnt derive
04:10 chronomex why not compress, exactly? I would only use lossless tiff compression that's supported universally, i.e. DEFLATE
05:04 * OGshoop slaps shoop around a bit with a large trout
05:24 chronomex SketchCow: hmm. _neither_ _images.tar nor _images.tar.gz derived properly.
05:24 chronomex _images.tar: http://www.us.archive.org/log_show.php?task_id=84691742
05:24 chronomex _images.tar.gz: http://www.us.archive.org/log_show.php?task_id=84690874
05:25 chronomex I didn't put the images in a directory; they're in the root of the _images.tar just like they are in the root of a _images.zip
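
For reference, the archive layout chronomex describes (images at the root of the tar, no enclosing directory) could be produced with Python's standard `tarfile` module roughly as follows. The function name and paths are hypothetical, and nothing here implies the deriver accepts the result; per the log above, neither variant derived.

```python
import tarfile
from pathlib import Path


def build_images_tar(src_dir, out_path, gzip=False):
    """Pack every file in src_dir at the root of a tar (optionally gzipped)."""
    mode = "w:gz" if gzip else "w"
    names = []
    with tarfile.open(out_path, mode) as tar:
        for path in sorted(Path(src_dir).iterdir()):
            if path.is_file():
                # arcname drops the directory part so members sit at the root.
                tar.add(str(path), arcname=path.name)
                names.append(path.name)
    return names
```

Write the output file somewhere outside `src_dir`, otherwise the archive may try to include itself.
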
05:47 SketchCow Car
05:47 SketchCow Gar
06:03 chronomex I'm going to look into creating a utility to DEFLATE-compress these tiff files in place before uploading, if you guys can't take _images.tar.gz
06:04 chronomex well, might get better compression that way in any case
06:06 DFJustin why not just _images.zip
06:07 chronomex because _images.zip has a 2G hard limit, and I have some scans that wind up being 5G
06:08 DFJustin :o
06:08 chronomex compression only gets rid of 40% usually
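
One way to estimate a figure like chronomex's 40% for a given scan is to run the raw bytes through DEFLATE and compare sizes. The helper below is an assumed illustration using the standard `zlib` module; it is not his utility and it does not rewrite TIFF files.

```python
import zlib


def deflate_savings(data: bytes, level: int = 9) -> float:
    """Fraction of bytes saved by DEFLATE (0.4 means 40% smaller)."""
    if not data:
        return 0.0
    # zlib.compress wraps the DEFLATE stream in a few bytes of header
    # and checksum, which is negligible for multi-megabyte scans.
    compressed = zlib.compress(data, level)
    return 1.0 - len(compressed) / len(data)
```

Already-compressed or very noisy data can give a result near zero or slightly negative.
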
07:05 tux0 hello
07:12 tux0 think I read somewhere youtube has around 500 million videos
07:13 tux0 think it may have been their annual report to investors
07:15 tux0 that amount of data is just wow
07:17 tux0 anyways just checking in since the google video project :) be back later
07:33 db48x tux0: howdy :)
07:33 db48x and yes, 500 million videos is rather a large number
07:33 db48x we should start right away
08:54 alard chronomex: Thanks for the offer, but SketchCow is already thinking about text for the JSTOR toy.
08:56 alard dashclouds: The uploaded PDFs do indeed need checking. But did you mean that the bookmarklet uploads non-PDFs? That would be strange, since it's supposed to always download the PDF first.
08:57 alard Sorry, dashcloud without the s (if he's still here).
09:02 chronomex okay
09:47 SketchCow DERP
09:48 chronomex 20:16:40 <@SketchCow> Don't compress.
09:48 chronomex why not?
09:49 SketchCow I apologize for that terse, unhelpful statement.
09:49 chronomex don't apologize, just explain
09:51 SketchCow Just had to set some uploads.
09:52 SketchCow In general, it's better not to compress, and archive.org's derives will do all the right things, and make new versions of everything.
09:52 SketchCow But that's not 100% guaranteed.
09:52 chronomex why's it better, exactly?
09:52 SketchCow Like, I upload .avi films whenever possible, and it then makes compressed .MPG, .OGG, .MP4 versions, etc.
09:52 chronomex that's lossy compression
09:52 SketchCow It adds a layer of complexity to the deriver.
09:53 SketchCow An alternate version, of course, is to derive them yourself and upload them, it'll deal.
09:53 SketchCow Right, lossy derivatives from your uncompressed original.
09:53 SketchCow As it always keeps the original, then all the options are there.
09:53 chronomex I'm taking the scanner's raw uncompressed tiffs and turning them into DEFLATE lossless-compressed tifs
09:53 chronomex is there something you think is wrong with that?
09:54 SketchCow No.
09:54 chronomex okay, good
09:54 chronomex because there isn't
09:54 SketchCow Ha ha
09:54 SketchCow OH REALLY
09:54 * SketchCow grabs bottle, smashes neck
09:54 SketchCow I may be old but I can cut you
09:55 chronomex if you're worried about bitrot in the future, a DEFLATEd .tif stored in a .zip file is no better off than an uncompressed .tif DEFLATEd into a .zip file
09:55 chronomex but the former is a lot easier for me
09:55 SketchCow I'm not even a little worried about bitrot.
09:55 chronomex you may be old but your cat has a billion people read him on the internet
09:55 SketchCow Mostly, it's that nearly every other place on the internet takes your fatty file, makes lossy derivatives, then shows that and won't let you at the original.
09:56 chronomex fuck that shit
09:56 SketchCow At IA, it's the opposite, you can ALWAYS get to the original, and then it's touch and go what lossys get out.
09:56 SketchCow So I like to encourage that.
09:56 SketchCow But original can be whatever original you want, to taste.
09:56 SketchCow But I just try to wean/ward people off wrecking the file if they don't have to, since IA has the space and the will.
09:56 chronomex I'm compressing the originals because it makes it much faster to move around
09:57 * chronomex nods
09:57 chronomex ofc I checked that all the scanner metadata makes it through the compress process
09:57 chronomex dpi, model, etc
09:58 SketchCow Good deal.
09:58 chronomex I'm no dummy.
10:02 SketchCow I've decided I'm sick of people who, seeing the thousands of magazines I've uploaded, go "but he totally forgot XXXXX"
10:02 SketchCow Where XXXX is their random magazine they remember, vaguely.
10:02 SketchCow Never mind there's now more issues up than they could possibly read.
10:02 SketchCow Or that I might have more.
10:02 SketchCow Gift Horses!
10:03 chronomex yeahhhhhfuckoff
10:18 RedType Poland Magazine
10:29 SketchCow One guy who said this, I DID have the magazine he mentioned up... it just wasn't on the week-old list someone had posted in the forum.
12:19 alard SketchCow: Are you there? If so, I have a question for you about the server-side things of JSTOR.
16:23 Coderjoe SketchCow: it seems some people didn't learn this lesson as a kid: http://www.youtube.com/watch?v=wm-E7zkyCCA
16:51 Ymgve such a wasted chance to use "look a gift pony in the mouth"
18:17 SketchCow ALard, yes
18:29 SketchCow I have to step out to do some presentations/opening at a hacker space. E-mail me, I'll respond.
23:31 Paradoks join #foreveralone
23:31 Paradoks Oops.
23:37 Wyatt "...I'll be down at Cafe du Chapeau chowing down." This is the best way to say you'd eat your hat ever.
