[02:37] anyone do any book/manual scanning?
[03:04] Silent700: I do a lot
[03:04] coo
[03:04] just looking for tips on doing manuals...most I am unstapling, running through the duplex scanner
[03:04] I end up with two facing pages
[03:05] using a batch mode in Irfanview now to split them down the middle
[03:05] then the tedious task of assembling them in order in Acrobat
[03:07] Oh, I just use xsane
[03:07] It does all the magic for me
[03:07] I just feed it 200 sheets at a time
[03:07] I assume you're using Windows though
[03:09] yeah, tho linux is an option
[03:09] what does xsane do for you in that case?
[03:09] auto-split the page?
[03:09] or even more magic?
[03:12] Auto rotate and creates a pdf
[03:12] Then I split the pages in half with pdftk
[03:13] (but you have to be precise with your scanning position)
[03:14] hm
[03:14] actually, let me amend the bit about Irfanview
[03:14] I found a better way there - crop in Acrobat, export, crop to the other half, export
[03:14] either way, end up with a bunch of TIFFs completely out of order
[03:15] extreme slickness would involve OCRing those TIFFs, detecting page # and auto-sorting :)
[03:17] Prone to breakage, though
[03:17] majorly
[03:21] underscor: what do you use for physically scanning?
[03:21] Canon MX870
[03:22] Best feed tray I've ever used
[03:22] 's not that expensive either
[03:22] It's not a simultaneous duplexer, but it works wonderfully
[03:22] for things that are curled, crinkly, etc
[03:23] and it has pretty mible pickup
[03:23] infallible*
[03:23] Don't know how that became mible, haha
[03:23] They have a decent linux driver too, but it's ubuntu only
[03:23] (Well, ootb ubuntu only.
I'm sure you can get the ppd out of it somehow
[03:24] Fujitsu M4097D here
[03:24] Grey only, tho
[03:24] Oh, that's a dedicated scanning unit
[03:25] it was pretty srs in its time
[03:25] kind of outdated now, but I can crank through ringbound manuals with it quite nicely
[03:26] I've been using the Xerox scanners at work for color/nicer stuff
[03:26] brochures, etc
[03:26] I'm considering building a scanner for digitizing textbooks at some point
[03:27] Seeing as I really don't feel like slicing through the spince
[03:27] a page-flipper?
[03:27] *spine
[03:27] Yeah
[03:27] Unless there's an easy way to re-bind a book
[03:27] automated page flipping is prone to failure
[03:28] I have done a couple throw-away books by peeling the spine and manually de-binding
[03:28] That's why the archive pays people to stand there and do it
[03:28] which...sux
[03:28] http://i.imgur.com/Chlin.png Sadness :(
[03:29] "Admin attention needed"
[03:34] you broke derive
[03:34] I resubmitted a doc a couple nights ago that said that
[03:34] it eventually got processed
[03:35] Silent700: what ya doing here?
[03:36] learnin' about archivin'
[03:37] ha, ok
[03:37] so I just received a set of CD images -- the complete WWDC 1997 set
[03:37] it contains videos in .rm format and presentations in adobe persuasion format
[03:38] Adobe Persuasion?
[03:38] Never heard of that
[03:39] it failed to persuade you
[03:39] I dug up a copy of 4.0 prerelease for Mac OS classic
[03:39] https://www.adobe.com/products/adobemag/archive/pdfs/9701qapn.pdf for some info on it
[03:39] apparently 4.0 can export to PDF.
[03:40] also:
[03:40] someone should mention discferret ( www.discferret.com ) on http://www.archiveteam.org/index.php?title=Rescuing_Floppy_Disks
[03:41] it is more powerful than pretty much everything else out there and can handle MFM *hard drives*
[03:41] unfortunately the software needs some work, but that's in progress :)
[05:13] Back.
[05:19] Dude, Discferret is vaporware.
[05:19] Oh sure, there's a board, and some smatterings of programmatic advancement.
[05:20] But you can't buy one.
[05:20] You buy the bare board and a unicorn fart and both are interchangeable.
[05:21] Also, kudos on no pictures on the for sale page. Super sexy.
[05:21] Very, you know... complete-feeling.
[05:22] You'll also notice it doesn't have an entry for the catweasel, another piece of technerd garbage
[05:43] weasel... ferret... i sense a theme
[05:45] how about harddrive hedgehog?
[05:54] Well, yes, that was the point.
[05:54] They didn't like the catweasel, so they wanted their own overpromising tech
[05:57] well anyway, if you have problems with disc ferret you can bug balrog about it, i think he knows the guy who's working on that project or something
[06:19] I'm sure he does.
[06:20] anyway, are you really the textfiles.com guy?
[06:20] what about the Kryoflux? Also vapor?
[06:20] arbin: yes, SketchCow is Jason Scott.
[06:21] oh, cool
[06:21] I bought a Kryoflux.
[06:21] I think it's waiting at the post office, I got a yellow slip
[06:21] I might need to sign something
[06:21] I got mine recently
[06:21] imaged a couple Amiga disks with it, so that works
[06:22] SketchCow: when did you order?
[06:22] and now it's on the floor somewhere near my left foot
[06:22] I don't know
[06:22] Somewhere back
[06:22] P.S. 400 Floppies arrived
[06:32] SketchCow, what sort of floppies? How long does it take to read one?
[06:32] Floppy Disks. People are sending me floppies because I said I'd read them.
[06:33] Ah, ok.
[06:33] Thought it could be some special collection donation. :-)
[06:35] You need a floppy ADF
[06:35] it is called an autoloader
[06:35] i have one for 3.5
[06:37] i still need to script up the process and find my camera to snap the labels.
and my kryoflux needs to arrive
[06:38] hm http://www.professionalequipment.it/httpdocs/italiano/prodotti/diskette.jpg
[06:39] http://www.archive.org/search.php?query=collection%3Acdbbsarchive&sort=-publicdate has gotten a ton more items.
[06:39] It's about to get MANY more.
[06:40] * Nemo_bis feels the need for isoview.php
[06:44] 226 60.236 seconds (measured here), 33.93 Mbytes per second
[06:44] 226-File successfully transferred
[06:44] 2143243772 bytes sent in 60.22 secs (34755.4 kB/s)
[06:44] Ponder that one for a moment.
[06:45] i pondered.
[06:46] It's so unfair, underscor has to upload 18500 items at 10 Mb/s
[06:46] :-p
[06:47] ah that's a ftp transaction?
[06:47] new 226 looked familiar
[06:49] knew
[06:50] http://www.archive.org/details/20081201-TheWell&reCache=1
[08:05] hm, are those on IA? http://onlinebooks.library.upenn.edu/webbin/serial?id=philtransactions (via http://blogs.law.harvard.edu/sj/2011/07/24/aaron-swartz-v-united-states/ )
[08:15] For years, IA has been absorbing google books
[08:15] So no worries
[08:16] most of them seem to be on Hathi Trust
[08:17] and Gallica!
[08:17] Yes
[08:17] lol, the French make the Queen's research open
[08:18] This whole situation is so dreary and dull I am already quite bored with it.
[08:25] I'm uploading shareware CD-ROMs, that feels much more satisfying.
[08:27] Oh, well, I'm uploading some hundreds of GiB of pageviews statistics gzips
[08:28] And that's very satisfying as well if I think what it was like to upload them via FTP
[08:28] Trying to prove a "point" with locked journal scans is kind of uninteresting.
[08:29] I disagree.
[08:29] But there's no need to agree. :-p
[08:29] If I wanted to collect all those Philosophical Transactions on IA I'd love to be able to at least add a keyword to all items that exist already.
[08:30] Eventually, I'm sure they'll be a collection.
[08:30] This is something I often would like to do, it's very boring that only the original uploader can modify metadata.
[08:30] But still you'd want to distinguish individual papers from complete volumes
[08:30] boring's not a good word there
[08:30] You want to wikify it.
[08:30] That's how you end up with shit.
[08:31] Not necessarily, you lack even a "send feedback" feature. :-p
[08:31] *we
[08:31] info@archive.org
[08:31] That doesn't scale
[08:31] Yes, but it exists.
[08:31] Oh, so I can send huge lists of items with metadata corrections?
[08:32] "There's a typo here"
[08:32] "These 1200 items need this keyword"
[08:32] You could.
[08:32] Also, be clear: I am not archive.org
[08:32] yes, I know
[08:33] Over the years, a number of improvements and changes have been made, allowing various things to work better.
[08:34] I'm sure if you went ahead and asked for more cogent metadata corrections, some version of that feature might show up.
[08:34] Then again, that means someone vets the corrections.
[08:34] that means money
[08:34] And time
[08:34] yes
[08:34] Also, it means bringing the horror of Wikipedia's approach to IA
[08:35] Next, let's have people vote-not-vote on what items should be deleted.
[08:35] That'll be awesome
[08:35] Also, hats
[08:35] We need hats
[08:35] http://static6.businessinsider.com/image/4cc6d8f9cadcbb6b17050000-400-300/sockington-jason-scott-has-1493274-followers.jpg
[08:36] OK, enough CD-ROMs for tonight
[08:37] I added 55 today
[08:37] not too shabby
[08:37] S3 is very effective for injections.
[08:40] Also, puts cd.textfiles.com at 3.6 million files
[08:40] I suspect by the time this is over, it'll hit past 4 million
[08:40] Almost all of them an animated GIF of a bird
[08:40] If Archive Team had a mascot, it'd be Yelling Bird
[08:42] linky?
[08:42] http://www.indietits.com/comics/blogs.png
[08:42] http://www.indietits.com/comics/return.png
[08:43] Yelling Bird's favorite word is "Shitcock"
[08:44] http://questionablecontent.net/comics/1456.png
[08:52] Victims of sexual abuse by priests will no longer be able to sue the Catholic church for damages if a landmark judgment rules that priests should not be considered as employees.
[08:52] In a little publicised case heard this month at the high court, the church claimed that it is not "vicariously liable" for priests' actions. The church has employed the argument in the past but this was the first time it had been used in open court and a ruling in the church's favour would set a legal precedent.
[08:52] Oh, that one's rich
[08:59] that's ridiculous.
[08:59] of course they're employees!
[09:00] they employ arguments but not priests?
[09:02] About the whole metadata debacle; I'd suggest (if someone were interested in hearing it) a "suggest metadata change". It doesn't have to be moderated instantly, maybe it could be viewable from some separate tab on the asset and be in a queue for moderation.
[09:02] I have no idea who would have time to moderate that though. Maybe some ad-hoc trusted-peer IA metadata warrior group.
[09:04] Another possibility is a suggestion thing, which I could then inject changes with, as a test.
[09:04] Isn't that what I wrote? :O
[09:05] Where what.
[09:05] Just now, above
[09:07] Sorry, getting tired
[09:07] I meant my doing it in my own environment instead of bothering IA developers
[09:10] Ah, gotcha
[09:25] Damn, adding these is like potato chips
[09:25] The script does 99% of the work
[09:26] Half the time, I'm going "yeah"
[09:28] When I started this on Friday, it had 72 CD-ROMs.
[09:28] Now we have 300.
[09:38] Awesome
[09:38] That's a job well done
[10:26] http://26.media.tumblr.com/tumblr_lmioz5dvD21qbtj0jo1_500.jpg
[14:57] See, in the US, you just don't get those.
[15:16] more hungarian c64 magazine covers http://lion.xaraya.hu/mags/c67/
[15:16] http://lion.xaraya.hu/images/magimages/c_ujsag_l/commodore-ujsag-1986-03.jpg
[15:16] awesome stuff :)
[17:13] http://theinfo.org
[17:14] You know. I always post good shit.
[17:14] cool
[17:17] My KryoFlux came in the mail.
[17:17] Kind of nice, I ordered something, and it showed up.
[17:17] Unlike, say, the DiscFerret
[17:19] :)
[17:22] By the way http://act.demandprogress.org/sign/support_aaron/
[17:22] kryoflux gets some shit because sps have been very slow to document the IPF format which is retarded for a preservation organization, but there is an awful lot to be said for a product that's shipping
[17:24] I'm in the mix now.
[17:25] It'll go better and faster.
[17:25] Even if I have to throw some of you maniacs headlong into it
[20:53] We're combining the Archive Team Google Video download with archive.org
[20:53] So far, archive team has found 13,000 videos archive.org hasn't.
[20:54] Fuck yeah!
[20:54] High-fives all around
[20:55] damn nice
[21:58] nice
[23:49] SketchCow: Do you have info on alard's WARC wget?
[23:51] :) @ underscor's new philtrans
[23:52] :D