#archiveteam 2012-04-15,Sun

Time Nickname Message
00:32 πŸ”— DragonDon Greetings all
00:40 πŸ”— mistym Hi DragonDon!
00:40 πŸ”— DragonDon Hi mistym , how goes it?
01:19 πŸ”— mistym DragonDon: Pretty good, thanks! What about you? (Sorry, totally missed your message!)
01:49 πŸ”— godane1 SketchCow: how do you delete .pureftpd-upload files on archive.org?
01:49 πŸ”— godane1 i'm trying to upload a maximum pc cd from june/july 2010
01:50 πŸ”— godane1 but my internet sucks
01:52 πŸ”— godane1 can anyone help?
01:58 πŸ”— Wyatt|Wor godane: Can you not use rsync?
01:59 πŸ”— godane i'm using gtkftp
01:59 πŸ”— godane *gftp
02:04 πŸ”— Wyatt|Wor Wait, is that dotfile you mention for resuming the upload?
02:11 πŸ”— godane i think so
02:11 πŸ”— godane but gftp will not know to resume it
02:11 πŸ”— godane anyways i'm 44% done
02:12 πŸ”— Wyatt|Wor Weird, I thought gFTP could do that.
02:12 πŸ”— Wyatt|Wor Try wput?
02:12 πŸ”— godane it could if the file name is the same
02:12 πŸ”— DragonDon mistym, no probs, I ended heading off to a shower and getting dressed for the day :)_
02:13 πŸ”— godane i just want to know if i can remove the .pureftpd-upload files
02:13 πŸ”— Wyatt|Wor I don't think the client matters as long as you have the dotfile to resume on the server-side
02:13 πŸ”— godane so they're not uploaded to archive.org cluster
02:14 πŸ”— Wyatt|Wor It should just get removed when the upload completes.
02:14 πŸ”— godane I'm reuploading the file
02:15 πŸ”— godane it's not resuming any .pureftpd-upload
02:15 πŸ”— godane so those files shouldn't get deleted
02:15 πŸ”— godane but i want them to be
02:18 πŸ”— Wyatt|Wor In the future, since you say you have a bit of a flaky connection, you should be able to enable resume in gFTP, or use a resume-capable client.
02:29 πŸ”— godane it doesn't work if file name change
02:29 πŸ”— godane file is MPC_Buildit.iso
02:30 πŸ”— godane also because of my flaky connection i don't know if resume is best cause you could be getting a corrupted file
02:31 πŸ”— godane again don't care about resuming now
02:31 πŸ”— godane i just want the .pureftpd-upload.* files deleted
02:31 πŸ”— godane nothing else
02:35 πŸ”— godane so again how do i remove .pureftpd-upload.* files?
02:36 πŸ”— godane not crap about using the resume button
02:36 πŸ”— godane cause MPC_BuildIt.iso != .pureftpd-upload.*
02:37 πŸ”— godane fucking crap
02:37 πŸ”— godane one of the .pureftpd-upload.* files did rename
02:38 πŸ”— godane but uploaded a full 688mb iso just to get some 4121545648 image
02:38 πŸ”— godane what the hell?
02:41 πŸ”— godane now i'm resuming after the complete connection
02:41 πŸ”— godane may have to resume again cause of there being some other .pureftpd-upload file
02:44 πŸ”— chronomex .putrified
02:45 πŸ”— godane i'm sorry that i'm pissed off
02:45 πŸ”— godane but i may have uploaded 4 times as much data for one fucking iso
02:47 πŸ”— Wyatt|Wor I'm guessing your FTP account doesn't have the privs needed to rm and you don't have a shell account?
02:47 πŸ”— godane i just hope archive.org or gftp doesn't rename the other .pureftpd file cause i will just delete what i have
02:47 πŸ”— godane i have standard upload account
02:48 πŸ”— godane i can't ever put my isos in software cd shareware part of archive.org
02:48 πŸ”— Coderjoe http://archive.org/post/267549/stale-pureftpd-upload-file-blocks-checkin
02:51 πŸ”— Coderjoe wow
02:51 πŸ”— Coderjoe a familiar name in that thread (not including tracey pooh)
02:52 πŸ”— Wyatt|Wor haha, I see it.
02:53 πŸ”— Wyatt|Wor The S3 method seems best then.
02:54 πŸ”— Coderjoe it does allow for the easiest automation
02:54 πŸ”— Coderjoe (pushing 4990 videos up using it right now)
02:55 πŸ”— Wyatt|Wor Curious, though, Tracey specifies a preference for curl. I've always seen curl and wget as roughly equivalent-- what differences inform the decision to use one or the other?
02:56 πŸ”— Coderjoe for the s3 interface? probably because it is the one that sam documented in the api doc
02:56 πŸ”— Coderjoe http://archive.org/help/abouts3.txt
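For reference, the S3-style upload discussed here comes down to a single HTTP PUT per file. The sketch below only builds that request, assuming the endpoint and headers described in the abouts3.txt document linked above (`s3.us.archive.org`, an `authorization: LOW accesskey:secret` header, and `x-amz-auto-make-bucket`). The function name, item identifier, and keys are placeholders for illustration, and nothing is sent over the network until the request is passed to `urllib.request.urlopen`.

```python
import urllib.parse
import urllib.request


def build_ias3_put(identifier, filename, data, access_key, secret_key):
    """Build (but do not send) a PUT request for one file in one item.

    Assumption: endpoint and header names follow archive.org's abouts3.txt.
    """
    url = "http://s3.us.archive.org/%s/%s" % (
        urllib.parse.quote(identifier),
        urllib.parse.quote(filename),
    )
    req = urllib.request.Request(url, data=data, method="PUT")
    # Keys are the account's S3-style credentials, not the login password.
    req.add_header("authorization", "LOW %s:%s" % (access_key, secret_key))
    # Ask the server to create the item if it does not exist yet.
    req.add_header("x-amz-auto-make-bucket", "1")
    return req
```

Because each file is one independent request, a failed transfer can simply be retried from a script, which is what makes this route easier to automate than FTP.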
03:00 πŸ”— Coderjoe and i guess that hard drive fix line is just part of the legend. i could have sworn i've seen times where it is there and others when it isn't
03:00 πŸ”— mistym http://www.kickstarter.com/projects/120873716/your-world This TOTALLY deserves to have made it to Kickstarter
03:00 πŸ”— mistym Extremely professional site too http://www.yourworldinc.com/
03:03 πŸ”— Wyatt|Wor I asked this yesterday, but different people seem active, so I'll try again. If someone is interested in backing up their community/content on IA, how best to go about that?
03:04 πŸ”— Wyatt|Wor Well, archiving, not "backing up"
03:08 πŸ”— shaqfu mistym: WoW+SL?
03:10 πŸ”— shaqfu Oh, hah; reading through this, it's like the biggest pipe dream ever
03:11 πŸ”— aggro What's wrong with straight-up WoW?
03:11 πŸ”— aggro Every RPG I've ever played has the same basic structure anyway.
03:12 πŸ”— aggro tanks, healers, dps, yadda yadda
03:12 πŸ”— mistym shaqfu: Yeah, it's basically "here are some minor quibbles I have with WoW", not a concept for a game
03:17 πŸ”— shaqfu And funding it on $1.1M on that scope, yikes
03:17 πŸ”— shaqfu No wonder he's getting no money - it's an obvious bomb
03:19 πŸ”— Wyatt|Wor Hahah "I am an idea man."
03:23 πŸ”— godane finally uploaded: http://archive.org/details/Maximum_PC_CD_June_July_2010
03:24 πŸ”— balrog_ph shaqfu: That's an utter joke
03:24 πŸ”— Coderjoe oh man. i just had a painful thought: building a 6502 emulator in minecraft's redstone circuits
03:25 πŸ”— shaqfu balrog_ph: ?
03:25 πŸ”— balrog_ph That kickstarter
03:25 πŸ”— shaqfu Oh, yeah
03:51 πŸ”— SketchCow Back
03:51 πŸ”— SketchCow Morgan says hi.
04:18 πŸ”— chronomex hi morgan
05:06 πŸ”— winr4r morning
05:18 πŸ”— dnova mornin
05:18 πŸ”— shaqfu Mornin'
05:32 πŸ”— SketchCow Borp
05:33 πŸ”— winr4r morgan is a bad-ass name
05:33 πŸ”— shaqfu And a historical unit of measure!
05:34 πŸ”— winr4r "In genetics, a centimorgan (abbreviated cM) or map unit (m.u.) is a unit for measuring genetic linkage." <- how about that!
05:35 πŸ”— SketchCow http://www.esquire.com/features/robert-caro-0512
05:35 πŸ”— shaqfu Wait, really? I was thinking of the Dutch measure of land
05:35 πŸ”— SketchCow Read up on the crazy
05:42 πŸ”— winr4r wow
05:46 πŸ”— shaqfu ...sheesh
05:51 πŸ”— Coderjoe where's the pc world article again?
05:52 πŸ”— Coderjoe ah. found it
05:52 πŸ”— winr4r http://www.pcworld.com/article/253672/the_archive_team_rescues_user_content_from_doomed_sites.htmlw
05:52 πŸ”— winr4r too slow
05:52 πŸ”— winr4r heh
05:52 πŸ”— winr4r -w btw
05:53 πŸ”— winr4r (though apparently it makes no difference)
05:55 πŸ”— Coderjoe (from the 35mm vs digital article)
05:55 πŸ”— Coderjoe It certainly isn't. James Cameron's Avatar got the ball rolling back in 2009. The 3-D blockbuster could only be shown via digital projectors, and so the first wave of theaters upgraded in a hurry.
05:55 πŸ”— Coderjoe bull shit.
05:55 πŸ”— Coderjoe my local IMAX was showing it on dual 70mm strips
05:55 πŸ”— Cameron_D ohey my name
05:57 πŸ”— Coderjoe perhaps on the 35 side, that is true, but not universally
05:57 πŸ”— SketchCow Cameron's original hope for Avatar was that it could be a 3D-only proposition, but however quickly cinemas scurried to update their capabilities, it wasn't quite quickly enough. The film is being shown in several formats, including conventional 2D. Whether audiences favour the 3D (and IMAX 3D) versions is a significant factor in how far Avatar will spearhead the 3D-ification of effects blockbusters to come.
05:59 πŸ”— SketchCow That smacks strongly of the LA Weekly reporter finding "avatar only to be released digital" articles and not finding the "pushback causes some standard-issue formats to be released too" articles.
06:00 πŸ”— winr4r yes, it does
06:02 πŸ”— Coderjoe And then there was Valentine's Day. Instead of a 35mm print, the studio offered Belove either a DCP or a DVD of Breakfast at Tiffany's.
06:02 πŸ”— Coderjoe hahaha
06:02 πŸ”— Coderjoe because DVD is even half of what 35mm is
06:03 πŸ”— SketchCow The DVD offer was awesome
06:05 πŸ”— winr4r DVD? as in a standard definition DVD?
06:05 πŸ”— SketchCow Yeah, nice offer
06:05 πŸ”— SketchCow Who knows if that's real
06:06 πŸ”— SketchCow That LA Weekly person didn't really double-source, it seems.
06:06 πŸ”— SketchCow Bet they didn't even call the studio to check
06:06 πŸ”— SketchCow I was more fascinated that DCP won out as the internal format
06:11 πŸ”— winr4r "A few months later, in January, one of the companies that makes the raw, unprocessed film stock, Eastman Kodak, filed for bankruptcy."
06:11 πŸ”— winr4r ...but the film division was one of the parts of kodak that was actually profitable, so tell me how that's related
06:12 πŸ”— chronomex shhhh
06:12 πŸ”— winr4r (thank god, i'd die if ektar 100 went away)
06:13 πŸ”— SketchCow Shhhhhh
06:13 πŸ”— SketchCow This is not a great article.
06:13 πŸ”— SketchCow It informs to a few basic structures of the industry that are worth knowing.
06:14 πŸ”— SketchCow But it is ultimately weaksauce
06:34 πŸ”— Wyatt|Wor How does the relationship between 35mm movie film and 35mm photo film work? Is it the same stuff, only the final print is on film stock vs. photo paper?
06:35 πŸ”— SketchCow No
06:35 πŸ”— SketchCow There's other stuff.
06:35 πŸ”— winr4r same size, different sprockets, sometimes different emulsions
06:36 πŸ”— Coderjoe same width, different sprockets, different direction of travel...
06:36 πŸ”— Wyatt|Wor Okay, so it's not going to have the same grain characteristics?
06:37 πŸ”— Coderjoe that's largely the emulsions
06:37 πŸ”— winr4r Wyatt|Wor: the area of a photo from a 35mm still camera is greater
06:38 πŸ”— Coderjoe different emulsions have different grain sizes
06:38 πŸ”— winr4r so if the emulsions are the same, the 35mm movie camera will have more grain (but the emulsions aren't always the same)
06:38 πŸ”— Wyatt|Wor Because it needs to handle the stresses of running it through the gears repeatedly?
06:38 πŸ”— Wyatt|Wor (That's re: still camera area being greater)
06:40 πŸ”— Coderjoe unless you're in the low end of the market and working with reversal stock, you generally do not run your camera footage through a projector.
06:40 πŸ”— Coderjoe heck, in 35, your camera footage doesn't even have sound
06:40 πŸ”— winr4r Wyatt|Wor: oversimplifying, but the "height" dimension of a still shot is used as the "width" of a 35mm movie shot
06:41 πŸ”— Coderjoe 35mm still has the film pass the shutter horizontally, while 35mm motion has it going past vertically
06:41 πŸ”— winr4r yes, better way of putting it
06:41 πŸ”— Coderjoe ... unless you're using lucasfilm's rescued and adapted vistavision equipment
06:45 πŸ”— Wyatt|Wor Okay, so this is all a lot more complicated than I even realised.
06:46 πŸ”— Wyatt|Wor But basically, for me, what it boils down to is this: A photographer friend of mine once said a good rule of thumb was 35mm is roughly equivalent to a good 8MP CCD. Sound about right?
06:46 πŸ”— winr4r Wyatt|Wor: an 8mp bayer-interpolated CCD? no
06:46 πŸ”— chronomex now we're getting into internet-holy-wars territory
06:46 πŸ”— winr4r yes
06:47 πŸ”— winr4r we are
06:47 πŸ”— Wyatt|Wor Analogue vs. Digital is a vast theatre of combat. Well, I'm sorry about that.
06:48 πŸ”— winr4r they're very different things, they even resolve detail very differently
06:49 πŸ”— winr4r in any case i'm hungry
06:49 πŸ”— chronomex I REFUSE TO BELIEVE THAT THERE IS NO ONE STANDARD OF COMPARISON
06:49 πŸ”— Wyatt|Wor Well anyway, my point with all this is this DCP thing looks like it maxes out at 4096×2160... Am I at least on the right track to have the impression that that's a bit of a step backward?
06:50 πŸ”— winr4r Wyatt|Wor: see "resolve detail very differently" above
06:50 πŸ”— * chronomex convenes a subcommittee to define the ANSI Standard Pel
06:50 πŸ”— * chronomex defines ANSI Standard Pel == ANSI Standard Film Grain
06:53 πŸ”— shaqfu And this is why I don't go anywhere near video preservation
06:53 πŸ”— winr4r Wyatt|Wor: the longer version of that being that film never really runs out of resolution, it gradually resolves details less distinctly as they get finer
06:54 πŸ”— winr4r Wyatt|Wor: digital resolves things 100% sharply until you hit its resolution limit
06:55 πŸ”— chronomex I love the smell of nerd jihad in the morning! it smells like VICTORY.
06:55 πŸ”— chronomex winr4r: no. pels are *sampling*
06:58 πŸ”— winr4r chronomex: honestly, i hate the whole "film vs digital" thing because 1) nobody will ever agree 2) everyone is wrong 3) the world has decided in favour of digital so you might as well argue about whether the titanic or icebergs were cooler
07:00 πŸ”— winr4r let us eat pot noodles instead
07:02 πŸ”— winr4r Wyatt|Wor: don't be
07:02 πŸ”— * chronomex peers closely
07:02 πŸ”— chronomex yep, pixels.
07:02 πŸ”— winr4r whoops, i was accidentally scrolled up a tiny bit, thanks mouse
07:02 πŸ”— Wyatt|Wor I get that I'm a rank amateur; I'm just interested in this conundrum where the resolution limitations of DCP seem like they'll cause edge cases where it can't hold up to film.
07:02 πŸ”— winr4r so i was responding to something he said earlier
07:02 πŸ”— shaqfu winr4r: The iceberg was cooler; no way the ship was anywhere near freezing
07:03 πŸ”— winr4r need moar caffeine :/
07:03 πŸ”— winr4r shaqfu: haha
07:03 πŸ”— Wyatt|Wor So where do icebergs go on the Cool Wall?
07:14 πŸ”— winr4r bleh i think i am going to head back to bed
07:14 πŸ”— Wyatt|Wor Sleep well
08:04 πŸ”— Coderjoe http://archive.org/details/stage6-1351710
10:45 πŸ”— Wyatt|Wor Holy crap, it finished! It took about 5100 CPU Minutes, but that grep process finally finished! :D
10:46 πŸ”— Wyatt|Wor morbid curiosity: 1 malloc() hell: 0
10:47 πŸ”— oli time to upgrade from that pentium pro 200?
10:51 πŸ”— Wyatt|Wor oli, no, it's a bug in older versions of grep when you have a unicode locale.
10:52 πŸ”— oli ;P
10:52 πŸ”— oli wasnt srs
10:52 πŸ”— oli anyway what's up?
10:52 πŸ”— Wyatt|Wor Not much. Going home soon.
11:49 πŸ”— SmileyG rawwwwr
11:52 πŸ”— SmileyG winr4r: well the icebergs were made of ice, so they were at least 0c
11:52 πŸ”— SmileyG i'd think they were cooler than the titanic at any rate.
12:01 πŸ”— Jaybird11 I'm the one who reported the probable shutdown of Q-audio.net to @textfiles and @archiveteam. Possibly one of several.
12:02 πŸ”— Jaybird11 I have Q-audio posts up through 29479, the last numerically-indexed one. Well over 100 gigs.
12:03 πŸ”— Jaybird11 That leaves out probably nearly a year of content, since he switched to Base36.
12:04 πŸ”— Jaybird11 He is known to fight scrapers, including probably useragent blocking and IP blocking. File structure is quite simple.
12:10 πŸ”— Nemo_bis If I 7z a files to a non-solid 7z archive with a different compression rate than the previously used one, will the new option be respected for the new files or not?
12:10 πŸ”— Nemo_bis maybe alard knows
12:12 πŸ”— SketchCow Jaybird11: He's never going to go for it.
12:12 πŸ”— Jaybird11 Yeah I know. Worth the effort though anyway?
12:12 πŸ”— SketchCow I mean, I'm happy to play the part of white knight and make it easy enough to do.
12:13 πŸ”— SketchCow We can do a distributed attack
12:13 πŸ”— SketchCow But he'll just nail those
12:13 πŸ”— Jaybird11 Okay, here's the file structure info
12:13 πŸ”— SketchCow I'd say don't dump it here unless it's short.
12:14 πŸ”— Jaybird11 http://q-audio.net/i/XXX are info pages giving uploaded filename, timestamp submitted, size, etc.
12:14 πŸ”— SketchCow I had no idea you had a good amount yourself.
12:14 πŸ”— Jaybird11 http://q-audio.net/d/XXX are the files themselves. The server returns the XXX and not the real filename so you need the /i/XXX to find the filename.
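The two URL patterns just described can be paired up per item, so that the info page supplies the filename and timestamp for the matching download. A minimal sketch, assuming only the `/i/XXX` and `/d/XXX` layout stated above; the function name is made up for illustration and no requests are made:

```python
BASE = "http://q-audio.net"


def qaudio_urls(item_id):
    """Return (info_url, download_url) for one item ID.

    The info page carries the real filename, timestamp, and size;
    the download URL only ever returns the bare ID.
    """
    item_id = str(item_id)
    return ("%s/i/%s" % (BASE, item_id), "%s/d/%s" % (BASE, item_id))
```

For the last numerically-indexed post, `qaudio_urls(29479)` gives the `/i/29479` and `/d/29479` pair.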
12:14 πŸ”— SketchCow Want me to give YOU an upload slot? :)
12:15 πŸ”— Jaybird11 Sure. I don't have the info pages though, just the files. But the scraper I used did preserve the filenames.
12:15 πŸ”— SketchCow Man, how crazy is it that I hear "over 100 gigs" and I go "oh! Well, just dump that shit here."
12:15 πŸ”— Jaybird11 The files start numerically and go up through 29479. Then he switched to base36 and I never found a scraper to deal with that.
12:16 πŸ”— SketchCow As if someone mentioned it was an attachment they could mail
12:16 πŸ”— Jaybird11 The entire collection I think is over 300 gigs
12:16 πŸ”— Jaybird11 The reason for the shutdown is, he's tired of Dreamhost and can't find storage as cheap
12:17 πŸ”— Jaybird11 The scraper I used did not preserve the timestamps of the files so if you want those someone will have to scrape the info pages.
12:17 πŸ”— Jaybird11 At least he gave us some warning rather than pulling the plug suddenly.
12:18 πŸ”— SketchCow Big deal, if he's preventing download.
12:19 πŸ”— Jaybird11 A distributed attack is probably the only way to even hope for a full archival. Problem is, I don't know how much time we have left, probably nobody does.
12:20 πŸ”— Jaybird11 My collection of files is on Windows. If you'll want me to use Rsync, I'll need instructions for doing that on Windows or something.
12:21 πŸ”— SketchCow OK, so two things.
12:21 πŸ”— SketchCow 1. My hope is that showing I'm wearing big boy pants and am willing to throw a few bucks his way will persuade him to do the command.
12:21 πŸ”— Jaybird11 In case you didn't know, Q-audio is pretty much a twaud.io alternative this guy created when his Twitter client designed for the blind integrated audio upload support.
12:22 πŸ”— SketchCow 2. I am banking that he is basically against being randomly scraped by amateurs.
12:22 πŸ”— SketchCow Oh, I am well aware of what this is, you made sure of that months ago.
12:22 πŸ”— SketchCow At least somebody out there is looking out for his blind homeboys.
12:23 πŸ”— Jaybird11 Ah good. I use the service myself. I knew this would probably happen someday.
12:23 πŸ”— Nemo_bis Looks like 7z is smart enough.
12:23 πŸ”— Jaybird11 I pushed a few of my friends to update the existing scraper to support base36, but nobody ever did as far as I know.
12:25 πŸ”— SketchCow Coderjoe: I've added your next 311 videos to stage6.
12:25 πŸ”— Jaybird11 On a related topic, is anyone proactively archiving Soundcloud or Audioboo?
12:25 πŸ”— SketchCow Coderjoe: Took one command, just did it, so that's how easy it is.
12:26 πŸ”— winr4r good afternoon
12:29 πŸ”— Jaybird11 If we can't get his cooperation, one way to salvage at least a piece of Q-audio other than what I already grabbed would be to call out to people who have downloaded their favorite clips, or who still have stuff they've uploaded. I have most if not all of what I've uploaded myself, and in the case of things I made myself, I have it in lossless format to boot.
12:33 πŸ”— Jaybird11 I think one reason he's against archiving this stuff is that probably a lot of it was sent in direct messages between two individuals. There's really no way to filter out those posts. Posts recorded within Qwitter and its forks start with tmp, so filtering those would probably get rid of a lot of private stuff. But probably not all, and there's always the chance of filtering out something which might mean something to someone years do
12:35 πŸ”— SketchCow Right
12:36 πŸ”— SketchCow I'm aware, that is in fact what's going to kill it.
12:36 πŸ”— SketchCow I'm going in the front door here
12:36 πŸ”— SketchCow I never think that works.
12:39 πŸ”— Jaybird11 With your experience with Dreamhost, do you have any clues? If he cancels, does he have to sit out the rest of the month, then on May 1 it all goes boom? Or can he pull the plug on Dreamhost anytime he wants? This is all notwithstanding his ability to get sick and tired of amateur scrapers and rm -rf * the whole mess and be done with it.
12:40 πŸ”— SketchCow He can boom at anytime
12:40 πŸ”— SketchCow I don't think he will.
12:40 πŸ”— SketchCow His personal policing of the scraping is adorable and unneeded
12:40 πŸ”— Jaybird11 I think what prompted this was, last night a VPS was down and so was the control panel. I think he's had it with outages.
12:42 πŸ”— Jaybird11 On the subject of distributed archival. I've always thought it would be neat to have a distributed system people could run that just sits there, doing whatever ArchiveTeam wants. Sort of an opt-in botnet if you will. People could specify soft and hard limits for disk and bandwidth they're willing to donate to the cause, and also see what projects are running and exclude any they don't want to participate in for some reason.
12:43 πŸ”— winr4r Jaybird11: cow mentioned exactly the same thing in his talk at PDA :)
12:43 πŸ”— SketchCow Yeah, we call it Archive@home
12:44 πŸ”— SketchCow It's the logical next step for the universal tracker.
12:44 πŸ”— SketchCow This EXACT moment, I'm just delighted we have the universal tracker.
12:44 πŸ”— SketchCow Requires a little setup, but then whooooboy
12:44 πŸ”— SketchCow I just wish we didn't have to burn so much goodwill on mobileme
12:44 πŸ”— Jaybird11 I know there's a virtual appliance, but that's probably not accessible to the blind since it doesn't have any screen reader or anything, and you have to know what you want it to do.
12:45 πŸ”— SketchCow No, you don't want in on that crap, yet
12:45 πŸ”— SketchCow I suppose we could take a swing at making stuff more accessible, but we're not there yet.
12:46 πŸ”— oli Archive@home?
12:46 πŸ”— Wyatt universal tracker _is_ pretty slick. I was planning on setting it up for 8bc until that crashed into a mountain of scene drama or something.
12:46 πŸ”— oli how about Archive@everydedicatedandcolocatedserverpossible
12:46 πŸ”— Wyatt (I'm in touch with the people who ran it trying to get what dumps are available)
12:46 πŸ”— SketchCow Archive@home is a parody reference to seti@home, the distributed look for shit in space client
12:47 πŸ”— SketchCow Oh, that's right, 8bc exploded, didn't it.
12:47 πŸ”— oli i know :p
12:48 πŸ”— Wyatt SketchCow: Yeah, I'm still not sure what exactly happened, but I've reached out to 2xAA, and through him hopefully Jose will be cooperative.
12:48 πŸ”— Jaybird11 Why I think we would need both soft and hard limits is this. So normal projects, let's say you set a cap of 1TB you're willing to spend on disk. But here comes some new emergency project. Oh look at this! (Insert name of wildly popular service) has been acquired by Yahoo! They're giving the users twenty-four hours to get their junk off or it all goes away! Now your hard limit kicks in, and you start going at this new project like craz
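The soft/hard limit idea can be written down as a small rule: ordinary projects are held to the soft cap, and only a project flagged as an emergency may grow up to the hard cap. The sketch below is purely hypothetical; the class, field, and function names are invented for illustration, and no such client existed at the time of this conversation.

```python
from dataclasses import dataclass


@dataclass
class DiskLimits:
    soft_bytes: int  # cap for ordinary projects
    hard_bytes: int  # absolute cap, reachable only by emergency projects


def may_accept(limits, used_bytes, job_bytes, emergency=False):
    """Decide whether a volunteer's machine should take on another job."""
    cap = limits.hard_bytes if emergency else limits.soft_bytes
    return used_bytes + job_bytes <= cap
```

The same shape would apply to bandwidth, with bytes per day in place of bytes on disk.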
12:48 πŸ”— winr4r wait, 8bc is gone? :/
12:48 πŸ”— winr4r i remember poking around there a little while back, loved it
12:48 πŸ”— Wyatt winr4r: Thaaaat's how it's looking. And I was going to archive it after MobileMe, too. :(
12:49 πŸ”— winr4r :/
12:49 πŸ”— SketchCow Right now, I'm just trying to get off batcave.
12:49 πŸ”— winr4r Wyatt: hey, first pass of the screenshot bot has completed, btw
12:49 πŸ”— SketchCow Once I'm off batcave, I'll be then trying to get another server off archive.org.
12:49 πŸ”— SketchCow But batcave, he asked me THREE MONTHS AGO to get off
12:49 πŸ”— winr4r now to figure out why some pages cause it to hang for no good reason
12:49 πŸ”— SketchCow It's taken THAT LONG to work out the 20tb
12:49 πŸ”— Wyatt winr4r: Those take a while, don't they?
12:50 πŸ”— SketchCow How are those screenshots being generated, anyway.
12:50 πŸ”— winr4r SketchCow: by a python script using python's webkit bindings running in an Xvfb
12:50 πŸ”— SketchCow root@teamarchive-0:/2/FTP/tef# du -sh newsyc-03/
12:50 πŸ”— SketchCow 213M newsyc-03/
12:51 πŸ”— SketchCow That looks a lot like someone did some sort of awesome grab of ycombinator.
12:52 πŸ”— SketchCow tef: Wake up when you get a chance, I want to understand these files before I upload them.
12:52 πŸ”— Jaybird11 Cow, I'd love to be able to upload my Q-audio collection. I've been a bit concerned that, as far as I know, except for the real thing, I have the only, or one of few, copies.
12:53 πŸ”— SketchCow Yeah, since coming to work for the archive, it's scary how differently I think about the whole thing.
12:53 πŸ”— SketchCow Jaybird11: Would an FTP account be better than an rsync?
12:53 πŸ”— Wyatt How's the accessibility of Cygwin?
12:53 πŸ”— Jaybird11 Probably, unless you can instruct me on Windows.
12:54 πŸ”— tef which files ?
12:54 πŸ”— Jaybird11 It's mostly console so pretty good as far as I know. Let me make sure I don't have an rsync.
12:54 πŸ”— tef SketchCow: well, those files are captures of news.ycombinator's front page during the sopa blackout, at different times of the day iirc
12:54 πŸ”— Wyatt So what's the story with batcave anyway? Why are we getting booted from it?
12:54 πŸ”— SketchCow It's old style box
12:55 πŸ”— Jaybird11 Nope, don't have rsync.exe. Does a good Windows port exist that I can just download and use?
12:55 πŸ”— SketchCow They want to decommission the box and clear that rack.
12:55 πŸ”— Wyatt Ah, getting decommissioned
12:55 πŸ”— SketchCow Meanwhile, I'm on there like a tenacious old tenant who refuses to move
12:55 πŸ”— tef winr4r: are you doing python pyqt stuff ?
12:55 πŸ”— SketchCow It's a bit of stress for the admin but he's too nice to really confront me
12:55 πŸ”— winr4r tef: it uses GTK
12:55 πŸ”— SketchCow I would just do a straight transfer over to fos, but there's ironically not enough space.
12:56 πŸ”— winr4r (note: it's someone else's script i've altered, i don't really know anything about pygtk either)
12:56 πŸ”— tef winr4r: qt 4.8 uses webkit 2.2 so i'd recommend it over pygtk
12:56 πŸ”— Wyatt So basically, we all owe the admin a pint.
12:56 πŸ”— SketchCow Jaybird11: http://www.aboutmyip.com/AboutMyXApp/DeltaCopyDownloadInstaller.jsp
12:57 πŸ”— SketchCow That is a port but may be a bit much
12:57 πŸ”— tef SketchCow: actually if I recall correctly, the grabs of news.yc should be the front page and all links from that page
12:58 πŸ”— tef so it should have the comments & the articles linked to
12:58 πŸ”— SketchCow I have an idea. Wyatt, you work with jaybird in a private message to get his rsync going.
12:58 πŸ”— Wyatt I'll see what I can do.
13:00 πŸ”— Jaybird11 I've downloaded DeltaCopy. About to unzip.
13:00 πŸ”— Jaybird11 say I'll be away from this window for a bit while I look at it
13:00 πŸ”— winr4r tef: you're probably right, but webkit 1.x is what this box and Wyatt's VPS has
13:00 πŸ”— Jaybird11 Yes, I am using a screen reader
13:00 πŸ”— tef winr4r: ah cool
13:01 πŸ”— tef winr4r: I was meaning to hook up my company's crawler to irc here but I've sorta not had the time yet
13:01 πŸ”— winr4r tef: and i'm screenshotting fortunecity, i don't need to worry about any of the sites using features that only exist in webkit 2.x :P
13:01 πŸ”— tef winr4r: :D
13:01 πŸ”— tef winr4r: yeah there is also http://code.google.com/p/wkhtmltopdf/
13:02 πŸ”— Jaybird11 Okay, I have DeltaCopy installed. Is this basically a GUI Rsync?
13:02 πŸ”— SketchCow It might be.
13:03 πŸ”— SketchCow But a really basic one so I had hoped your scraper would work
13:04 πŸ”— SketchCow root@teamarchive-0:/2# ls
13:04 πŸ”— SketchCow ANYHUB BBC friendster FRIENDSTER-LOGS MANUALS SOPA-NEWSYC SYNTHMANUALS
13:04 πŸ”— SketchCow archiveteamorg-dir.xml.xz BERLIOS FRIENDSTER GOOGLEGROUPS MOBILEME-SETS SPLINDER thenews
13:04 πŸ”— SketchCow archiveteamorg-grp.xml.xz DNA friendster-grab.zip MAGAZINES SOPA-GRAB STUFF YAHOOVIDEO
13:04 πŸ”— SketchCow Ok, so there we go, the sort of roundup of data that was on batcave.
13:05 πŸ”— SketchCow Some of those are A TAD LARGE
13:05 πŸ”— Wyatt Wow, Yahoo Video is still hanging around undigested?
13:05 πŸ”— SketchCow A directory is.
13:05 πŸ”— SketchCow root@teamarchive-0:/2/FRIENDSTER# du -sh .
13:05 πŸ”— SketchCow 1.7T
13:06 πŸ”— Jaybird11 Sorry, I don't know how to do private messages in IRC. Do I have a server IP address or something to put into DeltaCopy?
13:06 πŸ”— Jaybird11 say Or a hostname? It asks for a hostname and a virtual directory name
13:07 πŸ”— Jaybird11 Also, sorry for typing say. I'm used to MUD/MOO systems where you actually have to type say before your text
13:07 πŸ”— Wyatt Jaybird11: You can use /msg username or to open a ...I guess it's like a private channel with /query username
13:11 πŸ”— Wyatt SketchCow: Where's he sticking this stuff?
13:12 πŸ”— SketchCow fos.textfiles.com::qaudio
13:12 πŸ”— SketchCow Oh man, this friendster thing is going to be a huge mess. :)
13:14 πŸ”— winr4r what happened?
13:14 πŸ”— Wyatt Well, we happened.
13:14 πŸ”— winr4r http://lavender.fortunecity.com/powell/58/
13:14 πŸ”— winr4r also, will someone tell me if that loads for them?
13:15 πŸ”— Wyatt Yes.
13:15 πŸ”— winr4r this one quite dependably causes the script to crash
13:15 πŸ”— winr4r okay
13:15 πŸ”— winr4r well, not "crash", just sit there forever
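One common way to stop a single page from stalling a whole batch is to render each page in a child process with a wall-clock limit, then skip and log the ones that run over. This is a generic sketch, not the screenshot script under discussion (which is not shown in the log); the function name is hypothetical and the command passed in stands for whatever renders one page.

```python
import subprocess


def run_with_timeout(cmd, seconds):
    """Run cmd as a child process.

    Returns True if it exited cleanly within the limit, False if it
    timed out (the child is killed) or exited with an error status.
    """
    try:
        result = subprocess.run(cmd, timeout=seconds)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

URLs that return False can be written to a retry list and examined by hand later.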
13:15 πŸ”— Ymgve that page...crashed my Opera
13:15 πŸ”— Wyatt Wait. What?
13:15 πŸ”— winr4r Ymgve: HM
13:16 πŸ”— Ymgve for some reason it works now tho
13:21 πŸ”— SketchCow OK, this is actually not as bad as I made it out.
13:22 πŸ”— SketchCow I pulled out of mothballs the infrastructure for importing Friendster and once I did that, things are clicking into place.
13:22 πŸ”— winr4r excellent :)
13:25 πŸ”— SketchCow Most importantly, I had a program called The Renamerator which allows me to keep a consistent naming for these friendsters.
13:25 πŸ”— SketchCow And this has shown a massive missing set of these files.
13:25 πŸ”— SketchCow So that's good.
13:26 πŸ”— SketchCow I think the next thing archiveteam wise is we need some programs written to pull down these files, do some hardcore analysis on them, and then upload those analysis files into the items.
13:28 πŸ”— SketchCow OK, now to use the renamerator on the friendster files, all the rest have been tucked in.
13:29 πŸ”— SketchCow This is how it works, for education:
13:29 πŸ”— SketchCow root@teamarchive-0:/2/FRIENDSTER# sh renamerator
13:29 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 1300397206 2011-06-26 11:13 friendster.1000001-1009999.tar.bz2
13:29 πŸ”— SketchCow So, what's the tagline at the end.... like tar.gz or tar.bz2
13:29 πŸ”— SketchCow tar.bz2
13:29 πŸ”— SketchCow -------
13:29 πŸ”— SketchCow What's the middle piece, the XXXXXXXXX-XXXXXXXXX.
13:29 πŸ”— SketchCow VVVVVVVVV-VVVVVVVVV
13:29 πŸ”— SketchCow 001000001-001009999
13:29 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 1300397206 2011-06-26 11:13 friendster.001000001-001009999.tar.bz2
13:30 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 1671043953 2011-06-26 12:28 friendster.1010000-1019999.tar.bz2
13:30 πŸ”— SketchCow So you see it took the file and helped me rename it so that leading zeros and length were consistent.
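The renaming step shown in that session, padding both ends of the ID range to a fixed width, can also be done without prompting. A minimal sketch, assuming names of the form `prefix.LOW-HIGH.extension` and a width of nine digits as in the example above; this is not The Renamerator itself, and the function name is made up for illustration.

```python
import re

# prefix . digits - digits . extension (the extension may contain dots)
PATTERN = re.compile(r"^(?P<prefix>[^.]+)\.(?P<lo>\d+)-(?P<hi>\d+)\.(?P<ext>.+)$")


def pad_range_name(name, width=9):
    """Zero-pad both ends of the ID range; leave other names untouched."""
    m = PATTERN.match(name)
    if m is None:
        return name
    return "%s.%s-%s.%s" % (
        m.group("prefix"),
        m.group("lo").zfill(width),
        m.group("hi").zfill(width),
        m.group("ext"),
    )
```

With consistent widths the archives sort in numeric order under a plain `ls`, which is how gaps in the set become visible.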
13:30 πŸ”— * winr4r nods.
13:30 πŸ”— BlueMax Good
13:30 πŸ”— SketchCow I should say that last line was it showing me the next to do.
13:31 πŸ”— SketchCow Anyway, this is helpful, since I have 79 of these to do.
13:31 πŸ”— SketchCow Then I have to see about uploading them.
13:36 πŸ”— winr4r are they huge?
13:39 πŸ”— Jaybird11 Okay it looks like I have around 1984 files syncing. Wanted to do a test set first
13:41 πŸ”— Jaybird11 This is going to take forever, my upload isn't the fastest
13:54 πŸ”— SmileyG jackdaniels
13:54 πŸ”— SmileyG yum yum
13:54 πŸ”— SmileyG everyone should raise a glass
13:56 πŸ”— * winr4r raises cup of tea
13:56 πŸ”— emijrp hay guise
13:57 πŸ”— winr4r hi emijrp
14:08 πŸ”— alard So, if I see it correctly, there are "1NZ1".to_i(36) => 77725 new-style items on q-audio.net?
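The Ruby expression above reads the ID `1NZ1` as a base-36 number. The same check in Python, plus the reverse conversion a scraper would need to walk the new-style IDs in order. That the site uses uppercase digits and assigns IDs sequentially is an assumption taken from the conversation, not something verified here; `to_base36` is a helper written for this sketch.

```python
import string

DIGITS = string.digits + string.ascii_uppercase  # 0-9 then A-Z


def to_base36(n):
    """Convert a non-negative integer to an uppercase base-36 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, remainder = divmod(n, 36)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


print(int("1NZ1", 36))   # 77725
print(to_base36(77725))  # 1NZ1
```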
14:09 πŸ”— alard Perhaps it shouldn't be that hard to download, if the front door stays shut.
14:09 πŸ”— Jaybird11 It wouldn't be except that he's fighting scrapers.
14:10 πŸ”— Wyatt I don't believe that's ever stopped us before.
14:10 πŸ”— alard You could coordinate that.
14:10 πŸ”— winr4r heh
14:10 πŸ”— winr4r i don't get why he would be blocking scrapers
14:10 πŸ”— alard One scraper at a time, at full speed, then someone else continues when it is blocked.
14:10 πŸ”— Jaybird11 I assume he's paying for bandwidth and doesn't want everyone sucking it up
14:10 πŸ”— winr4r Jaybird11: dreamhost is "unlimited"
14:11 πŸ”— winr4r i think
14:11 πŸ”— Wyatt If he's on Dreamhost, he's got at least a couple TB. They don't offer less, last I looked.
14:13 πŸ”— alard Dreamhost might make it even easier: just hack in and rsync everything out. :) (Not the way to go, obviously, but if you look at the spam problem hacking Dreamhost can't be that hard.)
14:14 πŸ”— closure "I have a complete archive of the Well" -- Waxy
14:30 πŸ”— LucianT Testing.
14:31 πŸ”— emijrp Testing.
14:38 πŸ”— SketchCow We have forever
14:45 πŸ”— BlueMax Testing our love
14:47 πŸ”— Jaybird1 Testing.
14:47 πŸ”— Jaybird1 Yup that works
14:48 πŸ”— SketchCow http://www.youtube.com/watch?v=yVJnMj2oKfo
14:48 πŸ”— Jaybird1 This is Jaybird11 using a different client, actually through a MOO
14:51 πŸ”— Jaybird1 God my Q-audio rsync of the stuff I have is going to take forever
15:02 πŸ”— SketchCow drwxr-xr-x 2 root root 4096 2012-04-15 06:27 FRIENDSTER-059000000
15:02 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 9720540573 2011-06-30 13:17 friendster.059950000-059959999.tar.bz2
15:02 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 9484202599 2011-06-30 16:00 friendster.059960000-059969999.tar.bz2
15:02 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 9747123742 2011-06-30 19:21 friendster.059970000-059979999.tar.bz2
15:02 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 9278985871 2011-06-30 22:07 friendster.059980000-059989999.tar.bz2
15:02 πŸ”— SketchCow -rw-r--r-- 1 jscott jscott 7637687509 2011-07-01 00:37 friendster.059990000-059999999.tar.bz2
15:02 πŸ”— SketchCow drwxr-xr-x 2 root root 4096 2012-04-15 07:59 FRIENDSTER-060000000
15:02 πŸ”— SketchCow drwxr-xr-x 2 root root 4096 2012-04-15 08:00 FRIENDSTER-065000000
15:03 πŸ”— SketchCow As you can see, it's a nice mix
15:04 πŸ”— Jaybird1 I don't know how or I'd create a Q-audio page on the ArchiveTeam wiki
15:07 πŸ”— winr4r is account creation still disabled, SketchCow?
15:07 πŸ”— SketchCow Yeah
15:07 πŸ”— SketchCow Maybe I'll fix that today
15:07 πŸ”— SketchCow Right now, doing friendster, and going out to buy a camera for the defcon documentary.
15:07 πŸ”— * winr4r nods
15:07 πŸ”— winr4r SketchCow: because two isn't enough? :)
15:08 πŸ”— SketchCow My two are for external interviews
15:08 πŸ”— SketchCow for the main thing, I'm buying 5
15:08 πŸ”— winr4r Jaybird1: in any case, if jason fixes account creation later i'll happily do the page for you, if you tell me what you want to go on it
15:08 πŸ”— winr4r five!
15:09 πŸ”— winr4r DSLRs?
15:10 πŸ”— Jaybird1 I guess just the standard ArchiveTeam info blob with URL, project status, etc. Obviously there's no tracker or source code. Later, if we are unable to get a full archive, we could put a callout to those who have private collections of their own or their favorite posts to come forward.
15:10 πŸ”— SketchCow Nah, not dslrs.
15:10 πŸ”— SketchCow Vixia thingy
15:11 πŸ”— winr4r ah
15:13 πŸ”— SketchCow 8.4G FRIENDSTER-025000000
15:14 πŸ”— SketchCow First one dropping in
15:15 πŸ”— Jaybird1 I killed the rsync for a sec to turn off compression of files. Since it's all audio, compression probably isn't accomplishing much and may even be slowing it down. I don't have access to transfer rates or percentages complete.
15:15 πŸ”— SketchCow Exactly
15:15 πŸ”— winr4r SketchCow: huzzah
15:16 πŸ”— Wyatt Haha, if uploads to IA made musical notes, what would 8.4G of friendster sound like?
15:16 πŸ”— SketchCow Jaybird1: So far you've uploaded 410mb.
15:16 πŸ”— Jaybird1 That's it? Oh man my upload is really slow!
15:17 πŸ”— winr4r that's slow?
15:17 πŸ”— SketchCow It's seriously not that bad
15:17 πŸ”— winr4r it actually took me all day to get 1.4gb of stuff to jason
15:17 πŸ”— Jaybird1 I'd have thought I'd have a gig or two up by now. It's been a few hours
15:22 πŸ”— winr4r Wyatt: are you around?
15:23 πŸ”— Wyatt Aye?
15:25 πŸ”— winr4r Wyatt: would you be so kind as to install xwd on the VPS?
15:25 πŸ”— winr4r see if i can figure out wtf is going on with these sites consistently not loading
15:29 πŸ”— Wyatt Done
15:29 πŸ”— Wyatt I think.
15:34 πŸ”— winr4r thanks!
15:37 πŸ”— Jaybird1 Wow there are a lot of these tmp files with totally random and meaningless filenames. Unfortunately, because of the way the service works, even the guy who runs Q-audio doesn't have records of who uploaded each file, except for possible web server logs of IP addresses.
15:52 πŸ”— Jaybird1 I'm going to go eat and do other stuff. Will stay connected though, and my Rsync will keep going.
16:13 πŸ”— Nemo_bis rsync never eats?
16:22 πŸ”— Wyatt Never. Especially after midnight. Moreover, getting rsync wet is ill-advised.
16:57 πŸ”— SketchCow OK, off to the shopping
16:59 πŸ”— winr4r SketchCow: don't forget my 5D mk III!
19:34 πŸ”— alard Hi Insectoid. From twitter I gather that you're probably Mongoose_Q of Q-Audio, right?
19:35 πŸ”— winr4r word, Insectoid
19:36 πŸ”— Insectoid There we go. Yes sorted. I am
19:36 πŸ”— alard Welcome!
19:37 πŸ”— alard You probably want to speak to SketchCow / Jason Scott.
19:37 πŸ”— balrog_ I'm curious, what are you involved with? :)
19:37 πŸ”— winr4r he's out buying things atm
19:37 πŸ”— Insectoid So first, there was Qwitter. It was pretty much the only Twitter client for the blind.
19:38 πŸ”— Insectoid So then, I thought... Blind people, they'd probably like to use Twitter for voice clips! so, I created q-audio, a simple way of uploading voiceclips to share on twitter using the Qwitter client
19:39 πŸ”— balrog_ …and people ended up using it for other stuff?
19:39 πŸ”— Insectoid That was a few years ago using a Dreamhost VPS. Dreamhost has kind of gone to shit, I want to move away, q-audio is the only thing holding me here. It's 304 gigs. It's primarily copyrighted content at this point, 2/3 of it from a simple sql query (voice clips had temporary names created with the python tempfile module so are easy to find)
19:40 πŸ”— Insectoid I thought I'd shut it down. A lot of people (well, 3) protested. so I'm here.
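Insectoid's point that voice clips are easy to find because their names came from the Python tempfile module could be sketched like this. The actual SQL query is not shown in the log, so this is only an illustration. It assumes the clips kept the module's default "tmp" prefix followed by 6 to 8 random characters (letters, digits, underscore) and an optional extension; the names `looks_like_tempfile_name` and `split_by_origin` are hypothetical.

```python
import re

# Assumed shape of a default tempfile name: "tmp" + 6-8 random characters,
# optionally followed by an extension such as ".mp3".
TEMP_NAME = re.compile(r"^tmp[A-Za-z0-9_]{6,8}(\.[A-Za-z0-9]+)?$")

def looks_like_tempfile_name(filename):
    """True if the filename matches the assumed tempfile naming pattern."""
    return TEMP_NAME.match(filename) is not None

def split_by_origin(filenames):
    """Split names into (probable voice clips, user-named uploads)."""
    clips, named = [], []
    for name in filenames:
        if looks_like_tempfile_name(name):
            clips.append(name)
        else:
            named.append(name)
    return clips, named

clips, named = split_by_origin(["tmpa1B2c3.mp3", "Track01.mp3", "Faixa_3.mp3"])
print(clips)
print(named)
```

A user-chosen name that happens to fit the pattern (for example `tmpfoobar.mp3`) would be misclassified, so the result is a heuristic and not a record of who uploaded what.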
19:41 πŸ”— winr4r how are you sure that it's primarily copyrighted?
19:41 πŸ”— bsmith096 someone said they have a complete WELL archive ?!?
19:41 πŸ”— winr4r (i don't doubt you, i'm curious)
19:41 πŸ”— winr4r bsmith096: waxy.org
19:41 πŸ”— Insectoid Filenames primarily
19:41 πŸ”— winr4r Insectoid: ah
19:42 πŸ”— closure bsmith096: waxy yes
19:43 πŸ”— Jaybird1 I'd like to jump in here. For those not following the Twitter conversation, yes it's true that a good deal of it is probably copyrighted files with no business ever having been uploaded. But there is some real user-generated content there.
19:44 πŸ”— winr4r on the upside, Insectoid gained an awesome and varied music collection!
19:46 πŸ”— Insectoid (u'Thaeme_Mar
19:46 πŸ”— Insectoid ioto_Feat_Heliao-os_Anjos_Choram.mp3',), (u'Faixa_3.mp3',), (u'Saint_Clements_Ch
19:46 πŸ”— Insectoid oir_-_Saint_Clements_Carol.mp3',), (u'um_amor_para_recordar392.mp3',), (u'06._Mu
19:46 πŸ”— Insectoid chos_Quieren.mp3',), (u'Track01.mp3',), (u'CoolSong-NormalSpeed.mp3',), (u'corte
19:46 πŸ”— Insectoid .mp3',), (u'plach.mp3',), (u'novela1.mp3',), (u'hore_linda_tao_linda.mp3',), (u'
19:46 πŸ”— Insectoid Linkin_Park_-_Live_In_Texas_-_With_You_HQ.mp3',), (u'107_-_COMING_AROUND_AGAIN.m
19:46 πŸ”— Insectoid p3',), (u'radioactivo_-_barbie_q.mp3',)]
19:46 πŸ”— Insectoid the last filenames out of the database as they currently stand
19:46 πŸ”— Insectoid >>>
19:48 πŸ”— winr4r Insectoid: yeah, i sympathise
19:49 πŸ”— * SmileyG is always out of the loop
19:49 πŸ”— SmileyG what you backed up?
19:54 πŸ”— Jaybird1 `mat Use headphones
20:02 πŸ”— Insectoid So ... Now what?
20:02 πŸ”— winr4r Insectoid: wait for jason to get back, he'll arrange whatever with you
20:02 πŸ”— Insectoid Ah okay :)
20:03 πŸ”— winr4r he is out buying video cameras
20:03 πŸ”— tsp___ Insectoid: IMO the first step is to disable uploads, so people can't put new stuff into it
23:56 πŸ”— Nemo_bis SketchCow, what do I need to do to be able to upload or move items to the wikiteam collection?
23:57 πŸ”— Nemo_bis I'll need to create several hundreds and it will be quite tedious to change them afterwards.
23:58 πŸ”— chronomex be good idea to upload to own collection, then add it as a subcollection to archiveteam
