#archiveteam 2012-02-14,Tue

Time Nickname Message
00:14 🔗 soultcer http://www.archive.org/search.php?query=%28collection%3Atop_domains%20OR%20mediatype%3Atop_domains%29%20AND%20-mediatype%3Acollection&sort=-downloads
00:15 🔗 soultcer Why are all the top downloads from porn sites?
00:15 🔗 kennethre admittedly, that is a great website
00:15 🔗 Coderjoe because the internet is for porn
00:16 🔗 DFJustin is IA actually slurping the videos or just pages/images
00:16 🔗 BlueMax slurping, great choice of words
00:16 🔗 soultcer I get that someone would go to a porn site to watch porn, but why go to Internet Archive and download a warc archive full of porn instead of just going to the site directly?
00:17 🔗 BlueMax soultcer: free, easy and (likely) virus-free porn, can't really complain
00:17 🔗 DFJustin porn sites tend not to be great for usability
00:17 🔗 DFJustin pop-ups, interstitial ads, etc
00:17 🔗 soultcer Using the wayback machine probably won't improve that...
00:18 🔗 DFJustin not that I would know mind you
00:18 🔗 kennethre i wonder if that ever happens
00:18 🔗 kennethre vintage internet porn via the wayback machine
00:19 🔗 BlueMax "Why yes, I do believe that your plumbing is broken"
00:20 🔗 Coderjoe BlueMax: no, no. "Why yes, ma'am, I do believe that your plumbing is in need of repair." or "is in disrepair."
00:21 🔗 BlueMax Shown up on my own joke
00:22 🔗 Coderjoe "Oh, no! How shall I remit payment for services to be rendered?"
00:22 🔗 soultcer Joke's on Coderjoe for actually caring about the dialogue in porn movies.
00:22 🔗 BlueMax Good point!
00:22 🔗 BlueMax "If you don't notice the snatch I have some bad news for you..."
00:22 🔗 Coderjoe soultcer: he was going for some old-timey kinda-steampunky vibe
00:23 🔗 Coderjoe and this was before things got nekkid
00:23 🔗 soultcer ;-)
00:24 🔗 BlueMax http://gizmodo.com/5884684/why-ill-never-trust-a-human-with-my-data-again
00:25 🔗 Ymgve haven't read more than the first paragraph, but....backups?
00:25 🔗 Ymgve also, you won't lose FTP and SSH just because the domain name goes down
00:26 🔗 Ymgve oh, several people are affected
00:26 🔗 Coderjoe I have some experience with the service-provider-owner-vanishes situation
00:27 🔗 BlueMax I linked it because I thought it was slightly relevant
00:28 🔗 Coderjoe a BBS/ISP I used and helped out had rented colo space from someone. at some point, the guy just vanished. Long court battles to get permission to enter and recover hardware.
00:28 🔗 DFJustin the article author says he's backing his own stuff up
00:28 🔗 DFJustin wonder if we can help these guys http://outercircle.wikispaces.com/
00:29 🔗 DFJustin at the very least grab dns for the domains
00:29 🔗 soultcer what are the domains?
00:30 🔗 DFJustin http://outercircle.wikispaces.com/users_to_warn
00:30 🔗 Coderjoe "There's a wiki that some customers have set up to help people recover data[outercircle.wikispaces.com], but that only helps those who are aware there's a problem."
00:31 🔗 Coderjoe what about people that are not aware there is a problem
00:33 🔗 DFJustin yeah exactly
00:35 🔗 Coderjoe the whole "some people are already locked out, due to an expired certificate"
00:35 🔗 Coderjoe uh... no
00:35 🔗 Coderjoe tell your browser to ignore that problem to get back in and save your shit
00:35 🔗 Ymgve well, tell them, not us
00:36 🔗 SketchCow OK, slammed out that stuff.
00:36 🔗 BlueMax Hiya SketchCow.
00:37 🔗 DFJustin here's the article dude's twitter https://twitter.com/mat
00:37 🔗 Ymgve he mentions Metafilter, but as far as I can see there's not _that_ many posts about the hoster there
00:38 🔗 BlueMax I don't think Twitter is the place that people would complain about something like that.
00:39 🔗 DFJustin I mean somebody can poke him to connect with us
00:42 🔗 soultcer I see at least 234 hostnames on ns(1|2).sabren.com, I wonder if he contacted every one of them
00:43 🔗 BlueMax It'd be better to operate under the assumption that he didn't
00:46 🔗 Coderjoe is that the only domain being used for DNS? also, what about people that were not using DNS there
00:49 🔗 Coderjoe from the gawker comments: CornerHost.com hosts 469 websites
00:52 🔗 Coderjoe how nice
00:52 🔗 Coderjoe another comment says "The owner/operator of Cornerhost is apparently active on github as of 4 days ago..." https://github.com/sabren
00:54 🔗 SketchCow By the way, I pulled SOMEBODY's twitter feed away this weekend.
00:54 🔗 SketchCow Someone was uploading new twitter stream captures.
00:54 🔗 SketchCow I can help us get it to the new machine and keep going, but I did kill that directory.
00:56 🔗 soultcer Well, in case somebody wants to have a go at warning cornerhost customers or downloading their stuff: http://pastebin.com/ge54giU0
00:58 🔗 Coderjoe ugh
00:59 🔗 Coderjoe http://businessprofiles.com/details/SABREN_ENTERPRISES_INC/GA-0145887
00:59 🔗 Coderjoe Status: Admin. Dissolved
01:00 🔗 Coderjoe apparently money problems: http://withoutane.com/
01:01 🔗 Ymgve 2009?
01:11 🔗 Coderjoe looks like he moved from Georgia to Texas
01:12 🔗 Coderjoe man
01:12 🔗 Coderjoe even his resume on his personal site is outdated: http://www.michalwallace.com/resume
01:13 🔗 Coderjoe lists nov 2001 to present for sabren enterprises, but the businessprofiles listing says it was dissolved
01:13 🔗 Ymgve we should send a repo team
01:13 🔗 Ymgve "you have failed to maintain data integrity, we are here to take your servers"
01:14 🔗 kennethre http://versionhost.com/
01:14 🔗 Coderjoe http://versionhost.com/contact/
01:15 🔗 Coderjoe if you have an emergency, you may also contact the missing guy on an atlanta-area number
01:16 🔗 Coderjoe so it looks like versionhost is another hosting company affected by this
01:16 🔗 kennethre old website
01:16 🔗 kennethre http://michaljwallace.com/
01:17 🔗 kennethre new one
01:17 🔗 Coderjoe jeez
01:17 🔗 kennethre https://twitter.com/#!/tangentstorm
01:17 🔗 kennethre he plays minecraft
01:18 🔗 SketchCow What exactly are we doing here?
01:18 🔗 Coderjoe https://twitter.com/mat/status/169226412992114688
01:19 🔗 SketchCow I expect to come out of this with a much stronger, effective, and reliable business system that isn't dependent on just me holding everything together.
01:19 🔗 SketchCow BEst line ever
01:19 🔗 kennethre lol
01:21 🔗 Ymgve gawker actually manages to do something GOOD?
01:24 🔗 kennethre "10-year-labor-of-love web framework" is always a bad sign
01:27 🔗 Coderjoe the "it was set to autorenew" attitude kinda sucks... does the registrar have current billing details to be able to do the autorenew? (especially after a 3-state-away move?)
01:28 🔗 kennethre every registrar i've ever been with just keeps the domain if you don't pay
01:28 🔗 kennethre to renew
01:29 🔗 Coderjoe many registrars institute a grace period
02:18 🔗 underscor t chronomex Do you know Pi?
02:18 🔗 underscor (aka Anthony Martinez)
02:30 🔗 rabidabid hello everybody
02:31 🔗 rabidabid I doubt there's much that can be done about this and I doubt it's going to be a big deal but I just wanna mention something ImageShack is doing
02:31 🔗 rabidabid I've been using their site since 2006 so I've accumulated a lot of images there. I checked the My Images section and see this message:
02:31 🔗 rabidabid You have 641 photos stored. Since you're over the 500 photo limit you'll need to upgrade to a Premium account or you'll only be able to keep 500 of your recent photos. Older photos will expire on the 1st of March.
02:32 🔗 rabidabid And the My Images section is pretty much the only place you'll see it. It's not on the front page and they didn't bother to send an email
02:33 🔗 DFJustin o shi
02:34 🔗 DFJustin luckily I only have 428 images, this is going to be hell for old forum threads and such though
02:34 🔗 rabidabid 500 is kind of a generous number so I doubt there will be too many people at risk here
02:35 🔗 rabidabid but maybe I'm underestimating
02:35 🔗 DFJustin hmm maybe
02:37 🔗 rabidabid I emailed them last night asking if there was a way to mass download my images (which there doesn't appear to be) and mentioned how they should publicize this more but I just got an automated response with links to their FAQs and a message that said "If required, your email will be answered as soon as possible."
02:37 🔗 rabidabid and well, I didn't get another email
02:38 🔗 rabidabid suffice to say I won't be using them anymore
02:38 🔗 rabidabid actually the only reason I've still been using them to this day is because that's just where all my other images are
02:57 🔗 Coderjoe rabidabid: you can use the mass-forumlink generator to get the links, then throw the list at wget or the like.
02:59 🔗 rabidabid hmm
03:00 🔗 Coderjoe at least I've done that in the past to make a big gallery page of everything
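Coderjoe's approach (collect the generated forum links, then hand the image URLs to wget) might look roughly like this in Python. This is only a sketch: the `[IMG]...[/IMG]` BBCode shape of the generator's output is an assumption, `extract_image_urls` is a made-up helper name, and the forum codes may point at thumbnails rather than full-size files.

```python
import re

# Assumed shape of a forum-link entry; the real generator output may differ.
IMG_TAG = re.compile(r"\[IMG\](.*?)\[/IMG\]", re.IGNORECASE)


def extract_image_urls(forum_links):
    """Return the unique URLs found inside [IMG]...[/IMG] tags, in order."""
    seen = set()
    urls = []
    for url in IMG_TAG.findall(forum_links):
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Placeholder input; example.com stands in for real image hosts.
sample = (
    "[URL=http://example.com/a][IMG]http://example.com/a.jpg[/IMG][/URL]\n"
    "[URL=http://example.com/b][IMG]http://example.com/b.png[/IMG][/URL]\n"
)
# One URL per line, which could be saved to a file and passed to `wget -i`.
print("\n".join(extract_image_urls(sample)))
```

Saving the printed list to a file such as `urls.txt` and running `wget -i urls.txt` would then fetch each entry.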
03:01 🔗 Coderjoe I might make a greasemonkey script to automate things a bit now, however
03:01 🔗 Coderjoe hmm. I'm at 444
03:38 🔗 chronomex underscor: yes, I know Pi and SaburWulf
03:43 🔗 DFJustin heh gave my friend a heads-up, turns out he has 1207 images on imageshack
03:43 🔗 rabidabid ick
03:43 🔗 DFJustin so may not be too rare
03:53 🔗 underscor chronomex: Awesome, that's cool
03:53 🔗 underscor Funny how small the world is
03:54 🔗 chronomex there are not a huge number of furries in the world
03:55 🔗 chronomex it just seems like so because they all whine about people who aren't furries
03:55 🔗 underscor very true
03:55 🔗 underscor chronomex: are you furry?
03:56 🔗 chronomex fuck no
03:56 🔗 chronomex and I'm not scalie either
03:57 🔗 nitro2k01 So that has a separate name
03:57 🔗 nitro2k01 And you knew it did...
03:57 🔗 underscor nitro2k01: depends on who you ask
03:57 🔗 nitro2k01 I've heard people call everything furry
03:57 🔗 underscor I consider furry = all anthropomorphized critters, regardless of skin
03:57 🔗 underscor chronomex: No need to be so defensive :P
03:57 🔗 chronomex I have too many friends who are
03:57 🔗 Ymgve are you a microbie?
03:57 🔗 chronomex aaaaugh
03:57 🔗 underscor lolol
03:58 🔗 nitro2k01 "I want to be an amoeba!"
03:58 🔗 nitro2k01 http://scalie.deviantart.com/art/No-Eternity-coloured-186255315
03:58 🔗 nitro2k01 (Googled for scalie)
03:58 🔗 chronomex sabur was actually my roommate for some months before he left town to attend a loser school and fuck some girl who whines too much
03:59 🔗 underscor hahaha
03:59 🔗 chronomex now they're engaged
03:59 🔗 underscor oic
03:59 🔗 chronomex she still is bitchy
04:00 🔗 chronomex let's stop this conversation
04:01 🔗 underscor chronomex: Do you want to, or shall I?
04:01 🔗 underscor Fuck it, I want to do it for once
04:02 🔗 underscor WOOT WOOT WOOT OFF TOPIC SIREN
04:03 🔗 BlueMax HONK HONK HONK OFF TOPIC HORN
04:03 🔗 nitro2k01 WOOF WOOF WOOF FURRY SIREN
04:03 🔗 underscor hahahahaha
04:03 🔗 underscor that was excellent
04:04 🔗 nitro2k01 woot mostly means "win" or similar btw
04:06 🔗 underscor yeah, I suppose
04:07 🔗 chronomex I have it aliased to /ots
04:07 🔗 chronomex woop woop woop off-topic siren
04:07 🔗 underscor oh, yeah
04:07 🔗 underscor It's woop, not woot
04:07 🔗 underscor damn
04:08 🔗 BlueMax Need a new offtopic siren? Why not Zoid-WOOP WOOP WOOP WOOP WOOP WOOP
04:09 🔗 nitro2k01 Did someone say zoid? Because I think I heard someone say zoid. http://canv.as/ugc/original/e922d098bd6083d2948bb5235203d8eed192b2f1.jpeg
04:30 🔗 Coderjoe yeah... the automated downloader scripts on future projects need to have a way to have the tracker tell the downloaders to stop cleanly
04:32 🔗 SketchCow I just killed rsync on batcave
04:32 🔗 SketchCow Because I need to bzip2 stuff before it can go up.
04:32 🔗 SketchCow Let's put it this way. It had 15 simultaneous rsyncs going AND was transferring files AND was doing a massive bzip2
04:47 🔗 closure ionice
04:48 🔗 closure often makes these things more pleasant
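closure's one-word suggestion refers to the Linux `ionice` utility, which lowers a process's I/O scheduling priority. A minimal sketch of wrapping a heavy job such as the bzip2 run, assuming a Linux host with `ionice` and `nice` installed; the helper name `low_priority_cmd` and the file name are placeholders, and how much the idle class helps depends on the I/O scheduler in use.

```python
import shlex


def low_priority_cmd(cmd):
    """Prefix a command so it runs in the idle I/O class at lowest CPU priority."""
    # ionice -c 3 selects the "idle" I/O class; nice -n 19 is the lowest CPU priority.
    return ["ionice", "-c", "3", "nice", "-n", "19"] + list(cmd)


# 'archive.tar' is a placeholder; this only prints the command it would run.
print(shlex.join(low_priority_cmd(["bzip2", "archive.tar"])))
```

The returned list can be passed directly to `subprocess.run` if the job should actually be started.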
04:52 🔗 DFJustin dumping some 500MB HDDs from a garage sale, what could be on them!
04:56 🔗 underscor Setting up an openbsd box to replace my actiontec router
04:56 🔗 underscor I'm tired of dealing with it's fucking 1024 entry nat table
04:57 🔗 underscor its*
05:03 🔗 Coderjoe http://img22.imageshack.us/img22/9867/cableunplugged.jpg
05:09 🔗 closure holy wtf
06:40 🔗 Coderjoe bahahah
06:40 🔗 Coderjoe https://afaikblog.wordpress.com/2012/02/10/a-new-approach-to-gnome-application-design/
06:40 🔗 Coderjoe I think they're early for april fools day
07:35 🔗 Coderjoe username: eigenart
07:45 🔗 Coderjoe wonder if that is art made using eigenvalues
11:00 🔗 SketchCow SLAMMING METADATA
11:00 🔗 SketchCow SO BORING
11:01 🔗 ersi They see me slammin'
11:01 🔗 ersi they snorin'
11:01 🔗 ersi .. wut
11:02 🔗 SketchCow http://www.archive.org/details/bbc-rd-reports-1954-21
11:03 🔗 ersi haha, awesome subject
11:03 🔗 SketchCow http://www.archive.org/details/bbc-rd-reports-1954-23 is also good.
11:04 🔗 SketchCow I am adding things quickly, which merely means I'm adding it slowly.
11:04 🔗 SketchCow I'm trying to listen to interviews and podcasts
11:04 🔗 SketchCow I'm listening to a podcast, digitizing two tapes on two laptops
11:04 🔗 SketchCow Uploading friendster to archive.org
11:04 🔗 SketchCow And doing this metadata
11:04 🔗 SketchCow And I feel like I'm behind and moving too slow.
11:04 🔗 SketchCow That's the curse of what I have.
11:04 🔗 ersi whoa, I'm not really all that surprised @ BBC Research - it's within their focus/realm of knowledge.. but still, dang
11:06 🔗 SketchCow Yeah
11:07 🔗 SketchCow I'm happy to put it in
11:07 🔗 SketchCow Without this metadata, it's impossible to negotiate this thing.
11:07 🔗 ersi Metadata is very important
11:08 🔗 SketchCow http://www.archive.org/details/bbc-rd-reports-1954-28
11:08 🔗 SketchCow I agree these are all interesting essays
11:08 🔗 SketchCow Hence my wanting them up.
11:10 🔗 SketchCow I can add a new paper every 10 seconds right now.
11:10 🔗 SketchCow By hand.
11:10 🔗 SketchCow But there's 1,338 papers
11:10 🔗 SketchCow Calculate that, if you have a moment.
11:10 🔗 SketchCow I'
11:10 🔗 SketchCow I'm still adding.
11:11 🔗 ersi That'll take a while
11:11 🔗 SketchCow Multiply 1338 x 10 seconds.
11:12 🔗 SketchCow If you could
11:13 🔗 SketchCow 223 minutes. You took too long
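The arithmetic SketchCow asks for, as a small Python check (the function name is made up for illustration):

```python
def total_minutes(papers, seconds_each):
    """Total hand-entry time in minutes for a given per-paper cost."""
    return papers * seconds_each / 60


# 1,338 papers at 10 seconds each.
print(total_minutes(1338, 10))  # 223.0
```

That is a little under four hours of uninterrupted entry at the stated rate.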
11:22 🔗 ersi yeah, was bashing a systems integrator a little with a colleague
11:30 🔗 SketchCow http://www.archive.org/search.php?query=collection%3Abbc-rd-reports&sort=-publicdate&page=27 looking good!
11:37 🔗 SketchCow Found 1tb of doubled data.
11:37 🔗 SketchCow Always quality.
12:14 🔗 SketchCow This report deals with the statistical analysis of questionnaires in which observers enter their opinions under a series of graded classification, such as "Bad", "Indifferent", "Good".
15:43 🔗 chronome1 always a tricky task
16:41 🔗 SketchCow I've been blowing these R&D metadata sets in, and it feels like I've made all this headway, but it turns out I barely have.
16:41 🔗 SketchCow Even with short work, it's hours left.
16:48 🔗 alard It can't be automated?
16:48 🔗 SketchCow It's about as automated as I can get it.
16:48 🔗 SketchCow Unless I REALLY want to write a rather needlessly complicated thing.
16:49 🔗 SketchCow Which will have little general use.
16:49 🔗 SketchCow I can add a new entry every 20 seconds.
16:49 🔗 SketchCow But at 1,300+ entries to do, that's still a lot of time.
16:50 🔗 SketchCow http://www.archive.org/details/bbc-rd-reports-1957-15 but sexy!
16:53 🔗 SketchCow Motherfuckers wrote a lot of reports.
16:55 🔗 kennethre haha
17:24 🔗 SketchCow kennethre: Help me understand the system you're using.
17:24 🔗 kennethre SketchCow: what aspect of it?
17:25 🔗 SketchCow Like, what it is you're using that's then going into the machine.
17:25 🔗 SketchCow Because I think I may need you to set up a machine that I use as an augmenter for batcave/fortress
17:25 🔗 SketchCow Because I don't think the internal archiveteam portion of the infrastructure can ever handle the amount of pain your system is capable of.
17:26 🔗 SketchCow At least jamming you directly into the mobileme section, we can set something up that you see the result
17:26 🔗 SketchCow You were getting a great rate, but it left the system in a gutter wearing its bloody panties as a hat
17:27 🔗 kennethre haha, it was essentially 300 VPS's running on a large number of very very large ec2 boxes w/ a stupid amount of bandwidth
17:27 🔗 SketchCow Right
17:27 🔗 SketchCow I think we need a smartybox on your side that is then getting access to the mobileme subcollection
17:27 🔗 kennethre each instance was just running the seesaw script
17:27 🔗 SketchCow See if you can get me a smartybox
17:56 🔗 kennethre SketchCow: I don't follow
18:28 🔗 Schbirid is there a list available of all the google groups?
18:41 🔗 Schbirid SketchCow: http://video.fosdem.org/ conference videos, not sure if you know those already
18:49 🔗 SketchCow Jesus webm
18:49 🔗 Schbirid :)
18:49 🔗 SketchCow Why not just do it in quickcam and get it over
18:50 🔗 SketchCow http://www.archive.org/details/Fosdem2011Presentations
18:54 🔗 DFJustin lol
18:59 🔗 DFJustin man, I dunno if archive.org has enough space for these here xvids, I mean some of them are like 700mb
19:01 🔗 DFJustin it's like youtube stockholm syndrome
19:08 🔗 SketchCow DoubleJ: Your slot is back
19:15 🔗 Nemo_bis http://creativecommons.org/weblog/entry/31415
19:15 🔗 Nemo_bis worth archiving?
19:17 🔗 closure looking forward to seeing this tomorrow http://video.fosdem.org/2012/lightningtalks/git_annex___manage_files_with_git,_without_checking_their_contents_into_git.webm
19:17 🔗 SketchCow I want to understand your annex thing
19:18 🔗 closure that may help
19:18 🔗 closure should be a good 15 minute intro, I hope
19:18 🔗 soultcer Git-Annex is the solution to all my file management problems
19:18 🔗 closure oh, you use it?
19:19 🔗 SketchCow The archiveteam git-annex needs a porn folder
19:19 🔗 closure I've just been working on scaling git-annex to not leak memory when adding millions of files.
19:20 🔗 soultcer Extensively. I have one about 3 TB repository of media files with my music, videos and backups in it. Thanks to the numcopies setting I can always sleep well knowing that I won't lose a file
19:20 🔗 SketchCow I'd like to know when that's the case, because I'd like to set one up for our saves.
19:20 🔗 closure soultcer: awesome.. Maybe I forgot, but I don't remember you mentioning you used it before
19:21 🔗 soultcer I do most of the urlteam stuff in git-annex as well. Various hosts run scrapers, and I simply use git annex get . to fetch all their data ;-)
19:21 🔗 closure well, I think the scalability is fixed. At least, the limiting factor now is git's own memory bloat with a million files
19:21 🔗 closure 2625 joey 20 0 90324 56m 3260 S 0.0 0.7 1:28.94 git-annex
19:21 🔗 closure 2821 joey 20 0 198m 182m 1032 R 100.0 2.3 0:24.16 git
19:21 🔗 closure soultcer: jesus, I had no clue
19:21 🔗 soultcer I'd tell you how big the repositories are but I did a pacman update today and now git-annex is broken
19:22 🔗 closure ahahah
19:22 🔗 closure new ghc?
19:22 🔗 SketchCow http://vimeo.com/creativecommons classy
19:23 🔗 soultcer Some major upgrades to various shared libraries, because a lot of other stuff broke as well
19:26 🔗 kennethre soultcer: he pronounces git wrong, his opinion is irrelevant
19:27 🔗 soultcer Who pronounces git wrong?
19:27 🔗 kennethre the guy in that video
19:27 🔗 kennethre oh closure posted it.
19:27 🔗 kennethre closure: see above :)
19:27 🔗 closure lol, how does richih say git?
19:27 🔗 soultcer Like JIT
19:27 🔗 kennethre yeah
19:27 🔗 kennethre it's weird
19:27 🔗 closure oh well, I think he's german
19:28 🔗 SketchCow Found a way to slice 20 percent time off the adding of metadata to archive.org.
19:29 🔗 soultcer Given that git-annex speaks S3, an archiveteam.git that uses the Internet Archive to store the actual contents would work, right?
19:29 🔗 closure git-annex has special archive.org S3 upload support
19:30 🔗 SketchCow I need to think through that.
19:30 🔗 closure but, I'm not 100% happy with it.
19:30 🔗 SketchCow Bear in mind I am really treading water regarding it
19:30 🔗 SketchCow I'll happily play it up, but I understand it only surface-wise.
19:30 🔗 soultcer At least for read access that would be pretty awesome, though I guess the web special remote would work too
19:30 🔗 SketchCow I only understand git a little
19:31 🔗 closure Currently, each archive.org bucket has to be configured separately in git-annex
19:31 🔗 closure I've used it, but if I had a ton of collections, it might not work
19:31 🔗 closure http://git-annex.branchable.com/tips/Internet_Archive_via_S3/
19:33 🔗 closure soultcer: yes, web special remote is fine for stuff already in the IA
19:35 🔗 SketchCow Keeping in mind I am dangerously retarded....
19:35 🔗 closure sure, let's get back to basics.. git. big files.
19:35 🔗 SketchCow using git-annex basically lets you know when network connected items everywhere are providing resources that you can acquire from any other aspect
19:36 🔗 SketchCow So I go "where the fuck is my traci lords collection" and it goes "Oh, that's on the accounting server at your old work"
19:36 🔗 closure yeah, basically. If you have access to the other remotes.
19:36 🔗 SketchCow Since I checked them in
19:36 🔗 closure if work killed your account, you're screwed, obviously
19:36 🔗 SketchCow (I figured I'd use real-world examples)
19:36 🔗 SketchCow Well, that would be the equivalent of a disk failure
19:36 🔗 closure yep
19:37 🔗 SketchCow So I have traci on the old work server AND on two usb drives at my friend's house
19:37 🔗 SketchCow Do they need to be connected all the time?
19:37 🔗 SketchCow Or do I call dave at 3am telling him to hook the USB to the server in the living room
19:37 🔗 SketchCow And then git-annex goes "ah, there we are"
19:37 🔗 closure nope, things can be disconnected and offline. That's nearly the default state.
19:37 🔗 closure as soon as it can get to the drive, it's good
19:38 🔗 SketchCow And then it rebuilds the traci folder on my mom's fuse-enabled gmail account
19:38 🔗 closure I have a whole shoebox of 1 TB drives that I plug in when git-annex wants data from one of them
19:38 🔗 SketchCow That's viciously good
19:39 🔗 closure I have drives I sneakernet around and run git-annex on whatever computer they're plugged into, it keeps track of where everything is
19:39 🔗 SketchCow Two questions, one tech, one pr
19:40 🔗 SketchCow tech: I assume there's some client program running that is querying git-annex, on windows boxes or linux or whatever
19:40 🔗 SketchCow pr: has git-annex had any major announcement or is that a slow burn
19:40 🔗 closure tech: git-annex is a single standalone binary. No server. You just run it
19:41 🔗 closure (I have not ported it to windows though, just unix/linux/freebsd/osx)
19:41 🔗 DFJustin also what happens when you find out traci was 16 in that video and you want to nuke everything, are there logs and filesystem tables everywhere
19:41 🔗 closure DFJustin: yes, the forensics people will be very happy with the available data trail.
19:41 🔗 closure :P
19:42 🔗 SketchCow Dude, I mean, fuck
19:42 🔗 SketchCow Everyone knows, at the end, Traci will always betray you.
19:42 🔗 SketchCow It's the yin to her delicious yang
19:43 🔗 closure pr: Word's been getting out. There was an article in Linux Weekly News http://lwn.net/Articles/418337/ .. I presented git-annex at the Gittogether conference this fall
19:43 🔗 SketchCow I'll make noise next week
19:43 🔗 SketchCow It's not an archiveteam project but I do think it has archiveteam principles
19:43 🔗 SketchCow Saving traci lords from oblivion
19:44 🔗 soultcer Let's not forget the automatic features. You can specify "keep x copies of all files in this directory" and it will make sure you have enough copies around by a) offering to only copy files that have too few copies and b) not allowing you to drop files with too few copies
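The behaviour soultcer describes can be illustrated with a toy model. This is not git-annex's actual code or data format: `ToyAnnex` and its methods are hypothetical names, and the rule shown (a copy may be dropped only if at least `numcopies` other copies remain) is a simplified reading of the description above.

```python
class ToyAnnex:
    """Toy model of numcopies-style bookkeeping; not git-annex's real code."""

    def __init__(self, numcopies):
        self.numcopies = numcopies
        self.locations = {}  # file name -> set of repository names holding it

    def add_copy(self, name, repo):
        self.locations.setdefault(name, set()).add(repo)

    def needs_copies(self):
        """Files that currently have fewer than numcopies known copies."""
        return sorted(
            name for name, repos in self.locations.items()
            if len(repos) < self.numcopies
        )

    def drop(self, name, repo):
        """Drop one copy, but only if numcopies other copies would remain."""
        repos = self.locations.get(name, set())
        if repo not in repos:
            return False
        if len(repos) - 1 < self.numcopies:
            return False  # refuse: too few copies would be left
        repos.remove(repo)
        return True


annex = ToyAnnex(numcopies=2)
annex.add_copy("video.avi", "usb1")
annex.add_copy("video.avi", "usb2")
print(annex.drop("video.avi", "usb1"))  # False: only one other copy exists
```

Once a third copy is recorded, the same `drop` call succeeds, because two copies are still left elsewhere.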
19:45 🔗 SketchCow closure: Do you know the story of "no cat"
19:45 🔗 closure also you can have it store stuff encrypted. There are features, yes :) (Might even help with your Traci problem)
19:45 🔗 closure no cat?
19:46 🔗 SketchCow Explaining Wireless Telegraph.
19:46 🔗 SketchCow "You see, wire telegraph is a kind of a very, very long cat. You pull
19:46 🔗 SketchCow his tail in New York and his head is meowing in Los Angeles. Do you
19:46 🔗 SketchCow understand this? And radio operates exactly the same way: you send
19:46 🔗 SketchCow signals here, they receive them there. The only difference is that
19:46 🔗 SketchCow there is no cat."
19:46 🔗 closure aha, yes, heard er
19:46 🔗 SketchCow This is no cat all the way
19:46 🔗 closure fine praise
19:46 🔗 SketchCow Imagine a fileserver that has all your drives connected, keeping track of things....except there is no fileserver.
19:47 🔗 soultcer That's git-annex
19:47 🔗 SketchCow Right
19:47 🔗 SketchCow I work to make complicated concepts easier, it's all I do all day.
19:47 🔗 SketchCow Credit me with some influence or something, so I can die happy
19:48 🔗 SketchCow "I coded this by the glow of jason's guiding light"
19:49 🔗 closure the light of his towering ire
19:50 🔗 closure srsly, I would certainly not have written the same program if I were not in archiveteam
19:50 🔗 closure (being in a cabin with only dialup helps too)
19:52 🔗 SketchCow taking full credit for inspiration
19:52 🔗 * SketchCow poses like george washington crossing the delaware
19:53 🔗 DFJustin SketchCow: stuffs for archiveteam collection http://www.archive.org/details/something-awful-forums-2001 http://www.archive.org/details/konachan-siterip-2009
19:54 🔗 soultcer Does git-annex do checksumming on web remotes?
19:54 🔗 closure soultcer: git annex addurl <url> pulls it down and checksums it, yes
19:54 🔗 closure --fast does not
20:00 🔗 SketchCow DFJustin: Both swapped over now
20:07 🔗 soultcer Sweet, sha256 is now the default backend
20:10 🔗 closure soultcer: I feel a git annex status disksize comparison coming on
20:11 🔗 closure http://pastebin.com/hR87NQge
20:13 🔗 DFJustin I will note that neither the konachan nor SA stuff has been deleted
20:13 🔗 DFJustin so this is more of a just in case kinda thing
20:14 🔗 DFJustin sorry if that was unclear
20:16 🔗 soultcer closure: http://pastebin.com/PjGs3MN9
20:16 🔗 soultcer Though pretty much all with numcopies=3
20:17 🔗 closure I like how you use Disk: and Host: in your descriptions
20:17 🔗 soultcer Well it would still be pretty easy, since hosts are named for BSG characters, and disks are named for austrian politicians
20:19 🔗 DFJustin or I should say, the original sites are still up but the packs are gone from megaupload and filesonic (although the kona stuff is still on bittorrent)
20:20 🔗 closure hmm, yeah, it only shows the size of one copy.. perhaps I should make it say known annex size: 2 terabytes (plus 4 terabytes of redundant copies)
20:20 🔗 soultcer That would be pretty sweet
20:20 🔗 soultcer Because then I know when to buy a new hdd
20:20 🔗 closure or some such measure. Redundancy: 300%
20:21 🔗 closure oh, I'd have to walk every location log to do it, pretty expensive I think
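The figures closure is proposing can be worked out from a list of per-file sizes and copy counts. A rough sketch, assuming that information has already been gathered (which is the expensive step he mentions); `annex_sizes` and its input format are invented for illustration and are not part of git-annex.

```python
def annex_sizes(files):
    """Return (unique_bytes, redundant_bytes, redundancy_ratio).

    files: iterable of (size_in_bytes, number_of_known_copies) pairs.
    """
    files = list(files)  # allow generators, since the data is read twice
    unique = sum(size for size, copies in files if copies > 0)
    total = sum(size * copies for size, copies in files)
    redundant = total - unique
    ratio = total / unique if unique else 0.0
    return unique, redundant, ratio


TB = 10 ** 12
# Two 1 TB files, each with three known copies (as with numcopies=3).
unique, redundant, ratio = annex_sizes([(TB, 3), (TB, 3)])
print(unique // TB, redundant // TB, f"{ratio:.0%}")  # 2 4 300%
```

With three copies of everything, 2 terabytes of unique data carries 4 terabytes of redundant copies, which matches the "300%" figure in the chat.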
21:19 🔗 alard kennethre: Do you know how much diskspace a dyno on heroku can use?
21:19 🔗 kennethre alard: it's whatever
21:19 🔗 kennethre it needs to be download and upload though, because a dyno can be killed at any time
21:19 🔗 kennethre for splinder i was rsyncing every 2 minutes in the background
21:19 🔗 alard Yes, but could you download, say, 20GB?
21:20 🔗 kennethre depends :)
21:20 🔗 kennethre typically yes
21:20 🔗 kennethre there's no official limit
21:20 🔗 kennethre it's not typically an issue
21:20 🔗 kennethre the host machine can't run out of disk space, obviously
21:20 🔗 kennethre but i downloaded 2TB overnight
21:21 🔗 kennethre and some of them were 20GB+
21:21 🔗 alard So perhaps we could make something that lets you download and make chunks of 20 or 40GB, which you could then upload directly to archive.org.
21:22 🔗 alard 10 files of 20GB would fill an item, and I don't think 10 files is too much.
21:22 🔗 kennethre i think that'd be pretty prone to failure
21:23 🔗 kennethre what's wrong with uploading one user at a time?
21:23 🔗 alard The problem seems to be in SketchCow's batcave (or fortress) where you rsync to.
21:24 🔗 alard You rsync to SketchCow, and then it's bundled and uploaded to the archive.
21:24 🔗 kennethre i don't have to run at that ridiculous capacity
21:24 🔗 alard No, that's true.
21:24 🔗 alard And probably simpler.
21:25 🔗 kennethre :)
21:25 🔗 alard So SketchCow should find a good rate.
21:26 🔗 kennethre that'd be ideal for me
21:26 🔗 kennethre i doubt i could run like that constantly without someone noticing anyway
