[00:00] more seriously, I think we could actually use git-annex at my current workplace for versioning slide scans [00:00] or perhaps annotation data on that [00:00] slide scans are a few hundred megabytes each [00:01] that's kind of big [00:01] what sort of instrument do you use to scan? [00:01] they're massive, massive images [00:02] and what sort of originals? [00:02] I think [00:02] we're using a NanoZoomer 2.0 slide scanner here [00:03] on average, it's not quite as bad as a few hundred megabytes [00:03] but when you combine multiple focal planes with the highest image quality, yeah, it can get there [00:03] aha. [00:03] the idea is that you should capture enough data from the slide to permit diagnoses to be made from the capture [00:04] and the bar for that is "what a histologist can see through a microscope" [00:04] which is quite high :P [00:04] a friend and I have cobbled together a high-speed slide scanner, using a carousel slide projector + low wattage bulb + ground glass screen behind the slide, dslr where the projector lens normally lives [00:04] ahaha different slides :P [00:04] heh yeah [00:05] I forgot "slide" had another meaning [00:05] we've gotten really good results with a not-very-fancy camera [00:05] ah, here we go [00:05] http://sales.hamamatsu.com/en/products/system-division/virtual-microscopy/index.php?id=13222680 [00:05] maximum resolution is 0.23 micrometers/piel [00:05] pixel [00:05] yow [00:05] so for a 26x76mm slide, yeah [00:05] work that out [00:06] generally the person scanning the slide will select a region of interest so that they don't have to wait forfuckingever to get an image [01:42] http://torrentfreak.com/book-publishers-shut-down-library-nu-and-ifile-it-120215/ [01:42] According to the complaint, the sites offered users access to 400,000 e-books and made more than $11 million in revenue in the process. [01:42] See. [01:42] This is the thing. [01:43] It is so hard for me to go "OH NO A DIGITAL LIBRARY OF ALEXANDRIA IS GONE" [01:43] I work at one, thank you very much [01:43] i'm just wondering if that stuff is really gone :( [01:43] links that is [01:44] if library.nu at least comes back in some form one could crawl it, but this was crazy-sudden [01:44] but THINK OF THE CHILDREN [01:44] Oh I am [01:44] OK, finished [01:45] * SketchCow zips up [01:45] So, how long has git-annex been around? [01:45] I had someone drop this on me, so it's all new, but obviously it is rather mature. [01:45] I think two years? [01:45] maybe three [01:47] well their gitweb only goes back to 2010-10-09 [01:48] which is around the time articles about it started popping up [01:48] arrith: well that's 1/3rd the lifetime of git [01:48] so *a long time* suffices ;) [01:48] git-annex is not some flaky script that was quickly thrown together. I wrote it in Haskell because I wanted it to be solid and to compile down to a binary. And it has a fairly extensive test suite. (Don't be fooled by "make test" only showing a few dozen test cases; each test involves checking dozens to hundreds of assertions.) 
[01:48] from http://git-annex.branchable.com/not/ [01:48] the dev seems to be pretty capable is why i pasted that [01:49] that's about the time I first heard about it; I lurk on the vcs-home mailing list which is where it was announced iirc [01:50] arrith: I believe closure is joey hess, the author [01:50] oh wow, that's neat [01:51] 20 Oct 2010 is the earliest date in the debian changelog here: http://packages.debian.org/changelogs/pool/main/g/git-annex/git-annex_3.20120123/changelog [01:51] so yeah i'm going with late 2010 [01:51] seems reasonable [01:53] jesus 1/3 the lifetime of git?! [01:55] What the fuck kind of time measurement is that. [01:55] We'll be done with the project in .4 git-annex lifetimes [02:06] all of library.nu's actual content was hosted on other filehost sites [02:07] like megaupload [02:07] we're fucked [02:08] DFJustin: yeah, just need the links [02:08] almost like tpb and magnet links [02:08] but the alexandria comparisons are pretty silly because all of the stuff still exists in print in regular libraries [02:09] although there is some information in torrent files that won't be available through magnet links alone, tracker urls for example [02:16] http://pastebin.com/NhA3VPhK ... I'm just saying [02:34] https://plus.google.com/hangouts/extras/talk.google.com/jason's%2520incredibly%2520boring%2520clubhouse?authuser=0&hl=en&eid= [02:56] Blargh, I hate it when I have to repair my main desktop :( [02:56] At least this channel is publicly logged [02:59] closure: Been playing/learning about git-annex all evening [02:59] This is excellent! [02:59] Currently making a repo containing all archiveteam uploads on archive.org [03:00] (pointing to web remotes) [03:00] Mostly just to practice using it, but who knows, might be useful [03:01] closure: wrt "file sharing", isn't it just adding other people's remotes and vice versa? [03:01] (of course, they need to be trusted people since they need something like ssh access over git-annex-shell) [03:01] well yeah, basically [03:02] s/trusted/semi-trusted/ [03:02] you can also put git repos on http:// and no logins needed [03:02] or some other things [03:02] yeah [03:02] But if you put repos on http, how would you also distribute the files? [03:02] by http [03:02] in the same directory [03:03] (Since they're not actually wrapped in the git repo) [03:03] Oh, okay, just curious [03:03] they're in .git/annex/objects/ which is accessible via http if you put .git up for http [03:03] I'm only familiar with using git daemon to clone [03:04] So if I git clone a repo, I end up with all the files that are "in" that copy of it at that time? [03:04] no, it ends up empty, you have to get the files you want in that clone [03:05] So, how would you do that over http? [03:05] Just make .got/annex/objects web accessible, and mirror it? [03:05] s/got/git/ [03:06] you say "git annex get foo" and it goes and gets it, if you cloned from http:// it knows where to go [03:06] Oh, wow [03:06] That's rad! [03:06] Same with a clone over git://, or only http? [03:07] same with any clone, *except* for git:// actually [03:07] (because git:// protocol can't transfer arbitrary files) [03:07] but over ssh, sure [03:07] or rsync [03:07] Oh, okay. [03:08] What's the easiest way to expose a git repo over http? [03:08] (if you have an opinion) [03:08] well, I think you want to make a separate, bare repo, and there's this hook you have to enable. 
bit of a bother really [03:09] oh, I see [03:09] course for your repo of all the archiveteam stuff, you could just put it on github [03:09] yeah, 's what I planned to do [03:09] since you're telling it where to get all the files from the web [03:10] Since everything's web-remote [03:10] Yep :) [03:10] This is incredibly awesome, btw [03:10] that's sweet. [03:10] hmmm. [03:10] Athough now I'm wrestling cabal [03:10] looks like git-annex will solve some stupid problems I have [03:10] (trying to compile the latest git-annex, since the hackage version doesn't have the --file flag for addurl) [03:11] I'm not very fond of cabal. [03:11] with what version of ghc are you building it? [03:11] Could not deduce (Show a) arising from a use of `showHex' [03:11] Data/Digest/SHA2.hs:111:4: [03:11] [ 1 of 26] Compiling Data.Digest.SHA2 ( Data/Digest/SHA2.hs, dist/build/Data/Digest/SHA2.o ) [03:11] bound by the instance declaration at Data/Digest/SHA2.hs:109:10-39 [03:11] from the context (Integral a) [03:11] 7.4.1 [03:11] Latest [03:11] (the problem isn't in your thing, it's in one of the deps) [03:11] yeah, I had the same failre recently. Something broken there [03:12] did you work around it? [03:12] I can't remember [03:12] :P [03:12] I suppose I could build without hs3 [03:13] yep [03:13] in fact, I think that's what I did.. git merge no-s3 [03:15] I feel like I'm forgetting something dumb [03:15] V [03:15] 0 3:15AM:abuie@abuie-dev:~/master 23266 Ï git merge no-s3 [03:15] fatal: 'no-s3' does not point to a commit [03:16] wtf, you're underscor? [03:16] oh, yeah [03:16] sorry [03:16] try origin/no-s3 [03:17] Wheeeee [03:17] Thanks [03:17] now SketchCow needs to come in here as overfiend or something, and our collection of name confusion would be complete [03:18] haha [03:19] oops, forgot pcre-light [03:19] With addurl --fast, does WORM info get recorded? [03:20] (I know you can't do a sha backend with fast, but I wasn't sure about worm) [03:21] with the new version and --fast, it records the file size. Which is basically all WORM does [03:21] it doesn't use that backend, but it's the same level of assurance (ie, not much) [03:22] note that you can always `git annex migrate` later and it will pull it down from the web and convert to a checksum [03:23] Oh, okay, excellent [03:23] jfeifjsepfjdpsfjdkfjweklfjsdcx [03:23] git-annex: unrecognized option `--file=boingboing-2000-2005_files.xml' [03:24] git annex version? [03:25] git-annex version: 3.20120124 [03:25] whoops [03:25] My ruby $PATH and shell $PATH didn't match up [03:25] Working now [03:26] Heh, sorry [03:38] is batcave available for rsync'ing mobileme users again? drive is getting dangerously full [03:38] someone drove the batmobile thru the wall of batcave, I hear there's a new bat location somewhere [03:40] dcmorton: git pull [03:40] then you can run the uploader [03:44] underscor: got it.. thanks [03:44] np [03:49] closure: I'm imagining a giant git-annex repo with everything in archive.org [03:49] :D [03:49] That would be really neat [03:50] I wonder how well it would scale to that though [03:50] heh [03:52] you hit git scalability issues eventually [03:52] I've been working last 3 days in scaling git [03:52] git-annex to millions of files.. it does. but git, not so much :) [03:53] * closure has a directory with 300 copies of the linux kernel source tree in it. takes a while to rm [03:54] haha [03:54] Where does the problem in git lie? 
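(Going back to the HTTP-serving question above: the "separate, bare repo" plus the "hook you have to enable" that closure mentions comes down to roughly the following. A sketch only — the hostnames, paths, and web-server setup are assumptions, not anything spelled out in the conversation.)

    # server side: make a bare clone, enable the stock dumb-HTTP hook, export it via any web server
    git clone --bare myrepo /srv/www/myrepo.git
    cd /srv/www/myrepo.git
    mv hooks/post-update.sample hooks/post-update   # the sample hook just runs `git update-server-info`
    chmod +x hooks/post-update
    git update-server-info                          # run once by hand before the first clone

    # client side: clone over plain http, then fetch only the content you want
    git clone http://example.com/myrepo.git
    cd myrepo
    git annex get somefile
    # content comes from annex/objects inside the exported repo (.git/annex/objects for a
    # non-bare one, as closure says) if it was copied there, or straight from the recorded
    # web URLs for files added with addurl --fast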
[03:54] yay for lvm, so I can just nuke the volume [03:54] Just inefficiency with an index of millions of files? [03:54] oh, it keeps every file in .git/index and rewrites it all the time [03:55] some other stuff. Facebook was complaining about this, it seems their source tree is insane and too big for git [03:55] haha [03:55] damn [03:55] hmm [03:55] Is there a way to make this work? [03:55] git annex addurl --fast --file=NUMBERS/geocities-3-d.7z.001 http://archive.org/download/2009-archiveteam-geocities-part1/NUMBERS/geocities-3-d.7z.001 [03:56] what, to create the directory? [03:56] Right now it spits out an angry error about a nonexistent directory [03:56] Yeah [03:56] I mean, I can write logic to create the directories, and cd in to each one, etcetera [03:56] But that feels real clunky [03:58] will fix [03:58] \o/ [03:58] Thanks! [04:00] Out of curiosity, why'd you choose haskell? [04:00] (I personally love the language, but I know there are others that don't agree) [04:05] fixed. [04:05] because it was time to learn haskell and also I wanted something solid [04:06] I didn't know you were into haskell [04:12] "a monoid is a monad in the category of endofunctors, what's the problem?" [04:12] er [04:12] monad -> monoid [04:13] whoops [04:13] closure: Yeah, I got into it this summer when I was in summer residential governor's school [04:14] one of my classes was mathematical problem solving, and haskell's lazy evaluation and excellent iterable abilities let me solve problems 20 times faster than anyone else in the class [04:14] (who wasn't using haskell) [04:14] * closure hits yipdw with a typoclassopedia [04:14] I haven't done anything beyond write math programs in it though [04:15] underscor: heh, that's about my exposure too -- I've been using it for Project Euler [04:15] for some weird reason [04:15] well damn, I wish I'd known, I could have had you slinging git-annex code [04:16] underscor: also, Paul Hudak's The Haskell School of Expression is most excellent [04:16] I will forever love fibs = 1:1:zipWith (+) fibs (tail fibs) [04:16] Ooh, I'll have to check it out [04:16] Yeah, that was the first problem we had to solve [04:16] 10 thousandth fib number [04:17] Only took me a few minutes, everyone else spent >45 minutes [04:17] closure: Hehee [04:18] I need to learn about how the rest of it works, though [04:18] Like type declarations and stuff [04:18] hmm, I need to get ahold of the Haskell School of Expression [04:18] My exposure's pretty much limited to fuckery in ghci [04:18] underscor: try executing fibs !! 10000 on that definition, it will fly [04:19] I know :D [04:19] yeah, I've been starting to try to learn about dependant types and type level programming [04:19] on my Xeon it completes in something I can't measure [04:19] the reason why it works is also mind-boggling (to me anyway) [04:19] lazy evaluation up the ass [04:19] fucking delicious :D [04:19] closure: is there a way to push git-annex files to another repository? [04:20] Coderjoe: git annex copy foo --to reponame [04:20] Coderjoe: Yeah, git annex copy file --to remotename [04:20] after you set up a git remote for it [04:20] closure: Damn :P [04:20] and what protocols does it handle? [04:20] and is there an ability to copy multiple files? [04:20] any protocol that can be used for a normal git remote (except git://) .. ssh, rsync, http [04:21] yes, foo can be a file or a directory [04:21] or any number of either [04:21] or leave it off to do the whole current directory :) [04:21] You can push via http? 
[04:21] Damn, never new that [04:21] +k [04:22] um, no, you can't upload bia http [04:22] aw [04:22] well, I don't support WEBDAV yet at least.. [04:22] because http can do it... [04:23] true, but server side it's a bit of a nightmare [04:23] yipdw: closure: No Haskell School of Expression, but any other haskell books here catch your eye? http://hastebin.com/tahewokuco.coffee [04:23] Sorry for the gross formatting [04:24] underscor: I haven't actually read any of those, though I did meet Bryan O' Sullivan at Erlang Factory once [04:24] he's a pretty swell guy [04:24] :P [04:24] so I guess his Real World Haskell book is probably good [04:24] That's neat [04:24] I've been meaning to look at the Bird too [04:24] my name is in RWH :) [04:24] I suppose I was coming more from do-you-want-a-copy-of-them [04:24] ohh [04:24] illicit [04:25] I SEE [04:25] hmm [04:25] um, Haskell School of expression is listed there :P [04:25] the thing I like about HSE is that it uses functional programming for applications that IMO one does not see very often [04:25] it actually focuses on FRP [04:25] Woah, look how blind I am [04:25] which is pretty need for an introductory book [04:25] neat, too [04:25] yipdw: Yeah, mildly illicit :P [04:26] This is basically library.nu, except private [04:26] closure: you're cited in there? [04:26] reviewer [04:26] ahh neat [04:26] along with like 500 other people [04:27] Whoops, client crashed [04:28] Oh, yeah, I remember seeing your name, closure [04:28] I was like "I know that guy!" [04:28] haha [04:28] (yes, I'm that freak who reads all the reviews) [04:29] http://research.microsoft.com/en-us/um/people/simonpj/papers/history-of-haskell/index.htm [04:29] Looks interesting [04:30] yes, I enjoyed that one [04:30] Simon Peyton-Jones sounds like a supervillain's name [04:30] along with R. Kent Dybvig [04:30] Found: [04:31] Why not: [04:31] fmap a (getStale file) [04:31] getStale file >>= return . a [04:31] man, I love hlint [04:31] jvdksvjlewjvldsjvkdjsvldsvlkds [04:31] you should find some SPJ talks. They're intorductory, but he's one of the best presenters I've ever seen [04:31] It's unseeded right now [04:31] oh, hlint suggests alternative constructions? [04:31] I'll link it when it finishes downloading [04:31] Probably overnight [04:42] hmm, a <$> getStale file is better though. hlint must not like applicatives [04:51] closure: yipdw: http://ksnd.it/dl/6613c98564e [04:51] It's a djvu [04:53] expired request [04:53] Weird [04:53] One sec [04:53] works here [04:53] http://ksnd.it/v/6613c98564e [04:54] ^ closure [04:54] damn wasted too much time listening to jason, now I have to do things [04:55] hahahah [04:55] 00000000-0000-0000-0000-000000000001 -- web [04:55] 85019413-6049-441c-a4a9-2f17dc0e734a -- here (ArchiveTeam Releases (@ IA)) [04:55] semitrusted repositories: 2 [04:55] untrusted repositories: 0 [04:55] dead repositories: 0 [04:55] local annex keys: 0 [04:55] local annex size: 0 bytes [04:55] known annex keys: 276 [04:55] known annex size: 57 gigabytes [04:55] backend usage: [04:55] URL: 276 [04:55] closure: Working beautifully now with the directory fix [04:55] Thanks! [04:56] personally, and especially for the archiveteam repo, I "git annex untrust web" [04:56] although if it's all archive.org urls, you *may* trust it :P [04:56] hehe [04:56] I planned on doing that, but after I add everything [04:57] Not really for any particular reason, I suppose [04:57] Makes it so I don't have to force on drop though [05:27] Wow! 
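(A quick sketch of what the trust setting discussed above changes in practice — "somefile" is a placeholder, and this assumes the content is present locally with the web as its only other recorded location.)

    git annex untrust web             # stop counting the web as a dependable copy
    git annex drop somefile           # now refuses: no trusted/semitrusted copy elsewhere
    git annex drop --force somefile   # override, accepting that only the untrusted web copy remains

Leaving the web remote semitrusted (the default) is the trade-off underscor describes: drop then succeeds without --force because the web copy counts.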
[05:27] where: &w_collection=*archiveteam* | size: 30,266,111,202 KB| [05:27] That's incredible! [05:28] Of course, mobileme will nearly increase it 10fold [05:28] But still! [05:33] kekekek [05:33] what's the size of cdbbsarchive out of curiosity [05:35] gimme a sec [05:35] btw, this is where fos lives, if y'all are curious about stats [05:35] http://ia700108.us.archive.org:8088/mrtg/ [05:35] You can see where someone started uploading mobileme [05:35] probably dcmorton [05:35] haha [05:36] DFJustin: where: &w_collection=cdbbsarchive | size: 366,592,842 KB [05:36] That would be pretty cool to have as a git-annex repo too [05:38] hehe [05:39] (Recording state in git...) [05:39] 1 5:39AM:abuie@abuie-dev:~/cdbbsarchive 23540 Ï ruby ../ia_annex.rb Gold_II [05:39] Gold_II is not a collection. It is, in fact, a software [05:39] Mirroring Gold_II because its parent is cdbbsarchive [05:39] addurl GOLD_II.cdr ok [05:39] addurl GOLD_II.jpg ok [05:39] Wheee [05:39] git clone ALLSHAREWARE [05:39] That's how it'll be once this finishes ;P [05:40] guess this fulfills jason's wish for an easy way to download collections [05:40] Yeah, assuming he likes it [05:40] (ping SketchCow, so he sees it) [05:41] Also need to write hooks that will automatically do junk when items within a collection are updated [05:41] Need to see if they'll let me touch petabox code [05:41] ;D [05:44] They won't. [05:44] But provide assistance. [05:45] Quick, the boss is here [05:45] Look busy! [05:45] SketchCow: Yeah, I know. Hank and BK are still sore about that incident in October [05:45] (rightfully so) [05:46] DFJustin: SketchCow: 8 down, 800 to go! http://hastebin.com/cowotigili.hs [05:49] closure: git-annex: /home/abuie/digibarn/.git/annex/tmp/remote_web_182_980_URL-s4619--http&c%%archive.org%download%DigibarnBruceDamerOnHowWilliamShatnerChangedTheWorldhistoryChannel%DigibarnBruceDamerOnHowWilliamShatnerChangedTheWorldhistoryChannel.thumbs%history-channel-shatner-digibarn-brucedamer__000390.jpg.log: openBinaryFile: invalid argument (File name too long) [05:49] :( [05:52] ouch.. [05:52] I will fix that tomorrow [05:54] I guess to really have ALLSHAREWARE you'll want http://www.archive.org/details/tucows as well [05:54] (the description on which is now out-of-date....) [05:54] closure: <3 [05:54] Thanks! 
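(What a run like the Gold_II one above amounts to, per file, is a single --fast addurl. A rough sketch of the loop — ia_annex.rb itself isn't shown in the log, and the file names are just the two from the example output.)

    item=Gold_II
    for f in GOLD_II.cdr GOLD_II.jpg; do
        # record a web remote entry for each file without downloading anything
        git annex addurl --fast --file="$f" "http://archive.org/download/$item/$f"
    done

Each addurl makes its own commit on the git-annex branch, which is part of why these repos end up with thousands of commits there (as closure notes later).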
[05:54] DFJustin: Yeah, I plan on doing it too [05:55] But once I have an automation workflow in place [05:55] Also, still wanting to hand-comb output for the time being [05:55] :) [05:56] they need to do another tucows pull too, "this just in...7 years ago" [05:56] the site is amazingly still up [05:58] wow [05:58] that's pretty impressive [05:58] I remember using tucows in like 2004 [05:59] back on my G3 AIO [05:59] :D [05:59] Fuckin' 10 years old [05:59] hahaha [05:59] I remember when "winsock software" actually meant something [06:00] :o [06:00] That was a while ago [06:00] ;P [06:00] but I'm still a whippersnapper compared to some folks in here :) [06:02] Very true [06:02] Man, I remember playing this game [06:02] Damn, what was it called [06:02] It was like, you pretended you were in a museum [06:02] and there were all these puzzles and stuff you had to solve [06:03] I remember that in 3rd grade START (gifted education) on Windows 95 [06:03] Man, now I really want to know what it was called >:| [06:05] in 3rd grade we had apple IIs [06:05] We had apple IIs until 2nd grade [06:05] underscor: fixed [06:06] and then we got celerons with windows 95 [06:06] closure: :D [06:06] We couldn't use "Batang" on those celerons [06:06] It would crash the computer [06:06] I remember that, too [06:06] God, all these memories [06:07] OMGOMGOMGOMGOMGOMG [06:07] I FOUND IT [06:07] * underscor is so excited [06:07] http://www.abandonia.com/en/games/479/Museum+Madness.html [06:07] is that a text adventure game? [06:08] Nope, windows 3.11 and 95 [06:08] Man, I'm gonna have to set up a win95 vm so I can play this <3 [06:09] closure: Thanks again for fixing that bug [06:09] Couldn't resist, eh? ;) [06:09] Wow, lots of changes! [06:09] turned out to be easy, can just truncate and add a md5 for uniqueness [06:09] oic [06:11] http://www.amazon.com/Museum-Madness/dp/B0009U7CLQ [06:11] ha ha [06:11] prime eligible! [06:12] Customers buy this item with The Oregon Trail, 5th Edition by The Learning Company Windows 98 / Me / 95, Mac [06:12] Frequently Bought Together [06:12] Hehehehehe [06:12] I'm sure TONS of people have bought that combo lolololol [06:13] closure: Works like a charm, many thanks! [06:54] underscor: so how big and unweildy is the git repo so far? [06:54] wow [06:54] 70-01-08.us is kinda overloaded O_o [07:03] ha [07:03] Coderjoe: http://hastebin.com/totedewice.hs [07:03] Keep in mind that they're actually smaller than that, it's just in the middle of adding files [07:03] (re the du) [07:08] seems like a lot of disk space for 1765 urls [07:09] would repacking help? [07:11] Like I said, there are a bunch of temporary cruft laying around [07:11] s/are/is/ [07:11] They're all adding files, which takes up space until they're dropped [07:12] PSA: Sleepytime tea + agave nectar = apotheosis of human creation [08:19] closure: btw, when you're opening files for hashing in git-annex, are you opening them with the noatime option? [08:19] http://kerneltrap.org/node/14148 [08:22] someone here archived lachlan cranswick page, why can't i find it on the wiki under 'people' ? [09:00] oparty [09:01] Hooray, I fixed the jamendo bug [09:01] Asked Emjirp to help me find ones that messed up. [09:01] Shouldn't be many. [09:01] cryptops1: uh... because I forgot to add it to the wiki? [09:03] other than possibly uploading it to batcave (and perhaps downloading it to home), I can no longer remember what I did with the resulting warc file [09:04] ah. 
found it [09:04] 2.7G lachlan_cranswick/ [09:07] np [09:07] i was afraid it got thrown out [09:08] and, iirc, there is a copy on batcave for SketchCow to push into archive.org at some point [09:09] I delete everything [09:09] Need more space for dwarf fortress 2012 [09:10] World of Wanking 2012? [09:10] with the new Hot Elf Babes expansion pack? [09:13] <@SketchCow> I delete [09:13] Haha, like that'd happen! [09:59] mmm world of wanking [10:07] I am slamming 6,200 podcasts into archive.org. [10:07] http://www.archive.org/details/mypodcast-2020 [10:26] http://www.archive.org/details/mypodcast-zstalkshow [10:26] It's beginning. [10:26] * SketchCow bows. [10:38] * BlueMax laughs [10:38] Found this while browsing r/wtf on reddit [10:38] http://i.imgur.com/FW8xz.png [10:41] Here, BlueMax http://www.archive.org/details/mypodcast-dragonballradio [10:42] SketchCow: ? [10:42] What's this supposed to be [10:42] I'm rewarding you [10:43] * BlueMax scratches his head [10:43] Over 9000 rewards [10:47] 26 podcasts already up [10:47] BlueMax: Can you link me to the wtf [10:48] www.reddit.com/r/WTF/comments/pqrya/who_seriously_sits_and_writes_this_stuff/ [10:48] ahaha [10:49] and that links to this image with "Ariel's Wedding night" on textfiles.com: http://i.imgur.com/FW8xz.png [10:49] what ersi said [10:51] To think that page wouldn't have survived if it wasn't backed up [10:51] I've got 6 different scripts running to push these podcasts in. [10:54] And we should all be greatful [11:42] mirroring blackhat.com [11:45] wait, what the hell? lol [11:48] why is www.geocities.com.7z.302 is listed as WAVE file on archive.org [11:48] http://www.archive.org/details/2009-archiveteam-geocities-part1 [11:52] is that the fixed archive? [11:53] i don't know [11:54] i think i need to get a bluray burner [11:54] okay, sure enough i grabbed geocities off the torrent wire, but i had to grab some files again because some archives were damaged, though im not sure which ones [11:55] not sure if it might be the same on archive.org or whether it's been appended [11:59] compare the checksums [12:02] i'm just thinking of squashfs [12:02] cause squashfs does hard links if files are the same [12:04] i have full squashfs file of www.defcon.org that can be used host it locally on local lan [12:20] Archive.org isn't hot with some things. [12:20] .302 is a sound extension [12:20] But this isn't a sound file. [12:21] thats what i thought [12:28] http://www.archive.org/details/mypodcast-beatmd [12:28] 14 hours of sound mix! [12:32] All those file didn't have any license attached? [12:40] looks like alot of .ppt files are 404 on blackhat.com [12:40] for blackhat 2001 [12:59] blackhat.com was not that big [12:59] only 1.2gb [12:59] defcon.org is like 3.8gb [13:16] alot of .pdf didn't get download or are 404 on blackhat.com [13:16] thats not good [13:19] 1485 broken links it looks like [13:24] I heard that video talks contain hidden info using steganography. [13:29] i have to redownload it [13:29] stupid wget-warc delete everything i think in then try redownloaded it [13:30] i thought it was only redownloaded files i had [13:31] anyways i added a -o www.blackhat.com.log to the end of my command [14:43] Coderjoe: for hashing git-annex uses sha256sum etc commands. So whatever they do. [14:46] closure: Whenever you have a few minutes, want to look this over and see if I did anything wrong? [14:46] https://github.com/ArchiveTeam/ia-digibarn [14:53] underscor: yeah.. 
you need to push the git-annex branch too [14:53] 0 2:53PM:abuie@abuie-dev:~/digibarn 23635 Ï git annex status [14:53] ae80d947-67dd-46e7-88be-1503f57cd03b -- here (DigiBarn (@ IA)) [14:53] semitrusted repositories: 1 [14:53] supported backends: SHA256 SHA1 SHA512 SHA224 SHA384 SHA256E SHA1E SHA512E SHA224E SHA384E WORM URL [14:53] supported remote types: git bup directory rsync web hook [14:53] trusted repositories: 0 [14:53] untrusted repositories: 1 [14:53] 00000000-0000-0000-0000-000000000001 -- web [14:53] dead repositories: 0 [14:53] local annex keys: 0 [14:53] local annex size: 0 bytes [14:54] known annex keys: 2190 [14:54] known annex size: 12 gigabytes [14:54] backend usage: [14:54] URL: 2190 [14:54] closure: Ok, pushing now [14:54] it should auto-push after the 1st time.. that's where git-annex keeps its info [14:54] So, after I git push -u origin git-annex once, it should auto-push from then on? [14:55] that's been my experience [14:55] okay, cool [14:55] thanks! [14:56] Oh, wow, lots of objects in that branch [14:56] haha [15:05] 70% compressing [15:05] wow, that's more objects than I'd expect [15:05] or an arm box? [15:06] it'll have 2 files in the branch per file in the repo, plus some directories etc.. [15:06] oh, you did individual addurls per file.. so it also has 2000 commits I guess [15:43] closure: looks ok [15:43] woah [15:43] * closure core dumps [15:50] closure: ~17k objects [15:51] It's on an overworked nfsmount, so that doesn't help with IO [15:51] yeah, so it's some unavoidable, but some of it can be improved [15:51] how do you spider a website without download? [15:52] closure: Still pretty damn impressive! [15:52] :) [15:52] I'm testing the recursive collection downloader to git-annex script I wrote this morning on http://archive.org/details/vectrex [15:53] Nothing like being bored in statistics! [15:53] * Archivis2 chuckles [15:53] well, I'm thinking of adding an option --pathdepth=N , and it would take the last N parts of the path and use that for the filename. Then you could run one single git annex addurl and pass it all the urls in one go. This would be more efficient. [15:54] That'd be pretty neat [15:54] Although some of these are >1,000 files, so they probably would need to be batched smaller [15:55] closure: It works! \o/ http://hastebin.com/siyiyinoki.hs [15:55] (I purposely did a small collection to start with) [15:55] hmm, command line length limits you mean? [15:55] possibly a problem yes [15:55] xargs it [15:56] I do wonder if one repo per collection is the right granularity. You could put them all in one repo, might be more fun :) [16:01] All IA items in one repo? [16:01] I dunno how well it'd handle that [16:02] all archiveteam items in one repo [16:02] Oh, yes [16:02] That's how it's going to be [16:02] all IA would probably be insane [16:03] vectrex and digibarn are collections similar to AT [16:03] (number-of-items-wise) [16:03] AT one is running, but it takes much longer because there is a large number of individual files [16:04] ah, I see [16:04] Especially in the geocities/yahoo video things, where they're split into smaller 7z.nnn [16:04] let me write this feature and you can have much faster runtime [16:04] \o/ [16:06] https://github.com/ArchiveTeam/ia-vectrex [16:06] git-annex branch still pushin [16:06] g [16:14] http://blog.archive.org/2012/02/15/want-to-help-build-a-distributed-web/ [16:17] Just depth [16:17] filter (not . 
null) $ split "/" fullurl [16:17] fromend depth $ map escape $ [16:17] | depth > 0 -> filesize $ join "/" $ [16:17] oh yeah [16:25] Archivis2: ok, --pathdepth pushed [16:25] I'd be curious to see the comparison importing using it and xargs [16:28] Sweet, I wanted to try git-annex for archiveteam today and now I come here and someoen already did all the work ;-) [16:30] Are they checksummed? [16:30] not the way he's doing it, it would need to download them all. but can be migrated to checksums later [16:31] git annex migrate --backend=SHA256 [16:31] The meta.xml files do contain checksums, maybe it is possible without downloading everything? [16:31] oh, hmm, suppose [16:32] Oh, not meta.xml, _files.xml [16:32] Though it only carries md5 and sha1 sums, so you'd have to use the sha1 backend [16:41] maybe something like git annex migrate --converter=xml-parsing.rb [16:42] will think on it [16:45] does anyone happen to have freenode's podcasts? [16:45] the freenode network released some rather obscene podcasts around year 2006-2008? [16:45] lol, really? [16:45] yea they were pretty sweet! [16:45] according to their descriptions [16:45] was lilo in them? [16:46] and they won't upload them for me since i'm life-banned on that network [16:46] i don't know, i don't think so [16:46] he was an old pal of mine, it would be good to hear his voice [16:46] if he faked his death, you won't be hearing from him [16:47] i always like to think people faked their death with a small chance of finding them [16:47] but i've considered the possibility when my friends have 'died' [16:47] anyways ... there really ARE freenode podcasts! [16:47] i think the url is still up [16:48] podcast.freenode.net [16:48] i was wrong, they are from year 2009 [17:02] cryptops1: Lol how did you get life banned? [17:06] don't know, but i'm ban evading right now probably [17:09] i'm getting all the '404' files from blackhat.com [17:10] i did a spider to what i got of a local version of mirror of blackhat.com [17:10] 1496 files were missing [17:12] there are still 404 errors in the list but i'm getting most of it [17:34] one of the topics for a freenode postcast is 00:22:59 - We beg for money [17:34] so maybe lilo was still around back then [18:02] closure: Thanks! [18:02] How do I use it? [18:03] Hey. [18:06] Hi SketchCow [18:06] How goes it? [18:07] Are you new or did you change your name? [18:07] Oh, sorry [18:07] Let me go change the default for efnet [18:08] (I don't usually irc from my laptop, so all my settings aren't on here) [18:11] underscor: you should be able to do something like: geturls | xargs git-annex addurl --fast --pathdepth=2 [18:12] where urls spits out the URL list [18:12] yep [18:13] So, if I have example.com/dir1/file1.mp3 example.com/dir1/dir2/file2.mp3 and example.com/dir1/dir2/dir3/file3.mp3 will I end up with dir1/file1.mp3, dir1/dir2/file2.mp3 and dir1/dir2/dir3/file3.mp3? [18:17] with pathdepth=2 yes [18:17] um, no [18:18] it currently takes the last 2 parts of the path [18:18] maybe that should be --pathdepth=-2 and --pathdepth=2 should *skip* the first two parts? [18:19] That would be more useful [18:19] At least, in my usage scenario [18:20] Because I just don't want the ia6xxxxx.archive.org/16/items/ bit, but I want everything else [18:20] (which may be an arbitrary length [18:20] ) [18:26] It turns out that I can't rsync to the badcave server anymore. Is this a temporary thing or has my access been removed? 
[18:26] s/badcave/batcave/ [18:27] underscor: done [18:28] | depth > 0 -> frombits $ drop depth [18:28] | depth < 0 -> frombits $ reverse . take (negate depth) . reverse [18:28] schweeeeeeet! [18:28] swebb2: Batcave is offline, we're moving to fos/fortressofsolitude [18:28] Have to talk to SketchCow for a new module [18:28] (I dunno if he's doing them yet though) [18:32] ok. Can someone just email me my new creds or something when they're ready? [18:41] Poke SketchCow [18:41] He's the gatekeeper :) [18:42] Or just drop your email in here and we'll get it to him [18:42] s/email/email address/ [18:44] Whaaa [18:45] I just pinged him. [18:46] Basically, I'm having people verify with me which directories are done, so I can start pushing them along. [18:47] Okay, I see [18:48] SketchCow: Any news about the umich data? [18:48] IT's syncing, but REALLY slowly. [18:49] It's only down to C. [18:49] Oh, that's not fast. I'll be patient then. [18:50] SketchCow: How much space does fos have? [18:50] Just curious :D [18:50] I'm not letting you on fos. [18:50] I'm going to find you your own machine. [18:50] Right now, though I'm finding enormous slowdowns on fos and I am worried I am sharing a cancerous baremetal host. [18:51] Yeah, I was noticing that [18:51] (looking at the stats in the dom0) [18:52] SketchCow: Oh, okay. That works too, I suppose :D [18:52] The new machine is aprt of a new breed and I do not know the story. [18:53] Yeah, it's one of the new ganeti fai machines [18:53] (andy's been teaching me the infra) [18:54] What's strange is neither the VM nor the baremetal is spending a significant amount of time on iowait [18:54] We kept seeing that on the dev vm [18:55] Why aren't you asking andy for a machine? [18:55] That's all I'm going to do. [18:55] Oh, you're already running as cache=none, too [18:55] That's something I would have suggested to increase performance [18:56] I could also go "Why are you discussing internal archive.org infrastructure on a public channel on EFnet" but I gave up on that thread. [18:56] Well, I'm just some outside stranger [18:56] I figured you had more credance with him than I [18:56] I'm louder than you, yes. [18:56] Well, you're also an employee, unlike me [18:56] SketchCow: I haven't said anything that's not already publicly available knowledge [18:57] Tell yourself that. [18:57] But I'll quiet down about it then [18:57] Tell myself which? [18:58] Tell yourself that constant, consistent dumping of information on the operation of archive.org's machinery and methodology for its operation and remote manipulation will have no far-reaching consequences. [18:59] Also, have some pie [19:01] Alright, touche' [19:01] You're right, nothing good can possibly come from it. [19:02] You're young, and you're happy to be seeing the parts of the engineering that impresses you, but the ease of use that inevitably comes from some of the choices in the system, which date back to when literally 1 or 3 people had total knowledge of the environment and responsibility for it, which do not scale to univerally being known. [19:02] ...are not good to drop. [19:04] I see. Thanks. [19:04] Also, I confess that the bathhouse wasn't clothing optional [19:05] And there is actually no "rub room" [19:06] I hit a steam room in Helsinki, those guys know how to run a steam room [19:08] indeed. crazy find and their birch branch flagellation [19:08] haha [19:08] finns even [19:08] Our finn host told us to do that [19:08] We go to the sauna, and the guy who runs it looks at us, and goes "... 
first time?" [19:08] our finn host told us not to drink alcohol in the really hot room, or we could die. [19:08] "you don't need to do that." [19:09] Also, the snow thing was awesome [19:09] The guy who runs it has the demeanor of hourly motel clerk [19:09] was there glory holes? [19:09] oh, you went to the helsinki ice bar? [19:09] And when we were there, my lady has the steam room to herself [19:10] The ladies steam room [19:10] and he said it was fine for me to go up there [19:10] so we had a steam room to ourselves for a while [19:10] tmi [19:10] No, not the ice bar [19:10] I mean the hanging out outside and rubbing snow on yourself thing [19:11] yeah, in summer you just have to jump in the lake and swim as deep as possible, not as nice [19:12] http://www.flickr.com/photos/mirka23/6849416475/in/photosof-textfiles/ [19:17] I was told this was the best sauna, but I bet it was the most authentic [19:17] Wood-fired steam, etc. [19:19] looks suspiciously like a train [19:19] oh, slideshow [20:33] underscor: how's it going? [20:36] I have a new command pushed for you (or soultcer) [20:37] git annex rekey --force file1 SHA1-sNNN--XXX file2 SHA1-sNNN-XXX ... [20:37] sweet [20:37] the NNN must be size in bytes, the NNN is the sha1 of course [20:38] I'd recommend adding all the urls in one command, and then immedaitly rekeying them in one more command. That will be most efficient. [20:40] One repository for all Archiveteam projects sounds pretty awesome [20:43] er, the XXX is the sha1 , NNN is size [20:44] underscor, I don't understand, does having only your current machine actually prevent from doing something on the poor IA servers? :) [20:44] aka where are you uploading all this data to :p http://abuie-dev.us.archive.org:8088/mrtg/networkv2.html [20:44] git annex rekey --force filea SHA1-s$asize--$asha fileb SHA1-s$bsize--$bsha ... [20:47] so long http://www.archive.org/details/mypodcast-dbandit