#archiveteam 2012-02-11,Sat

Time Nickname Message
00:03 ๐Ÿ”— DFJustin looks like we are no longer on the way to destruction
00:06 ๐Ÿ”— kennethre SketchCow: \o/
00:06 ๐Ÿ”— SketchCow root@teamarchive-0:/3/MOBILEME-SETS# rm -rf full-1328865740
00:06 ๐Ÿ”— SketchCow always a good sign
00:06 ๐Ÿ”— balrog DFJustin: ohhh?
00:07 ๐Ÿ”— balrog where's this info from?
00:22 ๐Ÿ”— underscor http://i.imgur.com/YvqhK.jpg
00:36 ๐Ÿ”— emijrp You won a 1KB hard drive.
00:37 ๐Ÿ”— underscor :D
01:25 ๐Ÿ”— DFJustin balrog: http://tracker.archive.org/df.html
01:27 ๐Ÿ”— Coderjoe_ DFJustin: for me, it never changes from "loading" now
01:27 ๐Ÿ”— underscor Oh
01:27 ๐Ÿ”— underscor Sorry
01:27 ๐Ÿ”— underscor My bad
01:27 ๐Ÿ”— underscor Try now
01:29 ๐Ÿ”— Ymgve http://www.metafilter.com/112641/Listening-to-the-past-recorded-on-tin-foil-and-glass-for-the-first-time-in-over-a-century
01:29 ๐Ÿ”— Coderjoe i'm kinda curious about the internals of pubnub
01:29 ๐Ÿ”— kennethre Coderjoe: y u no pusher?
01:29 ๐Ÿ”— Coderjoe eh?
01:30 ๐Ÿ”— kennethre pusherapp
01:30 ๐Ÿ”— kennethre http://pusher.com/
01:30 ๐Ÿ”— Coderjoe I have a private web app I'm working on that could use message push capabilities
01:30 ๐Ÿ”— kennethre much better than pubnub :)
01:30 ๐Ÿ”— kennethre but pubnub has that fantastic song
01:30 ๐Ÿ”— Coderjoe kennethre: i mentioned pubnub because that is what the df tracker was using
01:31 ๐Ÿ”— underscor I love pubnub's song
01:31 ๐Ÿ”— kennethre Coderjoe: I assumed you wrote it
01:31 ๐Ÿ”— kennethre the song is truly the best thing about the product
01:31 ๐Ÿ”— underscor ^
01:31 ๐Ÿ”— kennethre http://www.youtube.com/watch?v=jZgcEj_qKLU
01:31 ๐Ÿ”— underscor It works well though
01:31 ๐Ÿ”— Coderjoe but my problem is that I don't want to use someone else's servers for pushing the messages
01:31 ๐Ÿ”— kennethre check out pusher though
01:31 ๐Ÿ”— kennethre http://rdd-glimpse.heroku.com
01:31 ๐Ÿ”— kennethre damnit they turned it off
01:31 ๐Ÿ”— kennethre nvm
01:32 ๐Ÿ”— Coderjoe and no, I have no special access to batcave or wherever tracker is
01:32 ๐Ÿ”— tef I met the evangelist from pusher this week
01:32 ๐Ÿ”— kennethre Coderjoe: why not?
01:32 ๐Ÿ”— underscor What is rdd glimpse?
01:32 ๐Ÿ”— tef it does seem like a hosted irc server somewhat, but that's nice anyway
01:32 ๐Ÿ”— tef I did call him on the lack of rest in his rest api and he was like 'fair cop'
01:32 ๐Ÿ”— kennethre underscor: it was a realtime feed of every article going through readability - I worked there up until a few months ago
01:32 ๐Ÿ”— kennethre underscor: looks like they turned it off though
01:33 ๐Ÿ”— underscor awww
01:33 ๐Ÿ”— Coderjoe kennethre: why not use someone else's hosted service? lack of trust?
01:33 ๐Ÿ”— underscor That's a bummer :(
01:34 ๐Ÿ”— kennethre Coderjoe: the internet is a hosted service
01:34 ๐Ÿ”— kennethre Coderjoe: paranoia ;)
01:34 ๐Ÿ”— Coderjoe granted, I'm not doing anything that absolutely requires confidentiality
01:34 ๐Ÿ”— Coderjoe but that's a bit different
01:34 ๐Ÿ”— Coderjoe and it isn't paranoia if they really are out to get you
01:35 ๐Ÿ”— Coderjoe (silly ISP logging legislative attempts, SOPA/PIPA/ACTA/etc, yaddayadda)
01:36 ๐Ÿ”— Coderjoe (google's tracking, etc)
01:40 ๐Ÿ”— kennethre meh
01:41 ๐Ÿ”— kennethre lol
01:41 ๐Ÿ”— kennethre my time is more valuable to me
01:41 ๐Ÿ”— kennethre particularly with realtime stuff
01:41 ๐Ÿ”— kennethre it's a queue
01:41 ๐Ÿ”— kennethre if it's data storage, that's a bit different
01:41 ๐Ÿ”— kennethre but that's just me
01:42 ๐Ÿ”— tef we use aws at work because of the storage guarantees from s3
01:44 ๐Ÿ”— SketchCow http://fortressofsolitude.textfiles.com/
01:44 ๐Ÿ”— * SketchCow bows
01:45 ๐Ÿ”— kennethre tef: we use s3 for EVERYTHING at heroku. You can't beat nine 9s of retention.
01:45 ๐Ÿ”— tef yup
01:45 ๐Ÿ”— kennethre tef: when you put data into a heroku postgres server, its also instantly streamed to s3
01:45 ๐Ÿ”— kennethre <3 s3
01:45 ๐Ÿ”— tef hmmmm
01:45 ๐Ÿ”— kennethre the WAL logs
01:45 ๐Ÿ”— tef we should look at heroku more seriously
01:46 ๐Ÿ”— tef but not this release cycle :3
01:46 ๐Ÿ”— Coderjoe SketchCow: hahah
01:46 ๐Ÿ”— Coderjoe so superman is a hoarder?
01:46 ๐Ÿ”— kennethre tef: for what?
01:47 ๐Ÿ”— tef I work on web archiving and web archiving accessories
01:47 ๐Ÿ”— kennethre i knew that
01:47 ๐Ÿ”— kennethre :)
01:48 ๐Ÿ”— tef well we do access servers from postgres for replaying
01:48 ๐Ÿ”— underscor SketchCow: hahahahaha
01:48 ๐Ÿ”— underscor I love it
01:48 ๐Ÿ”— tef we've been looking for hosted postgres because we're lazy people
01:48 ๐Ÿ”— kennethre tef: we're by far the best one http://postgres.heroku.com
01:48 ๐Ÿ”— SketchCow We all know which way it's going for Superman
01:49 ๐Ÿ”— underscor SketchCow: Now I just need a user account >:)
01:49 ๐Ÿ”— kennethre tef: all the plans get 2TB of storage
01:50 ๐Ÿ”— SketchCow You get your own machine
01:50 ๐Ÿ”— SketchCow You are not going on this one
01:50 ๐Ÿ”— SketchCow Just ask for a machine from AB
01:50 ๐Ÿ”— underscor :(
01:50 ๐Ÿ”— underscor I have one, they just won't give me a lot of space to play with
01:50 ๐Ÿ”— SketchCow See?
01:50 ๐Ÿ”— SketchCow Hoarding machines
01:50 ๐Ÿ”— underscor ha
01:50 ๐Ÿ”— SketchCow I knew it
01:51 ๐Ÿ”— underscor I'm just addicted to big storage
01:51 ๐Ÿ”— kennethre I want to get a real disk array
01:51 ๐Ÿ”— kennethre I have a measly 10TB at home :(
01:52 ๐Ÿ”— kennethre pre-redundancy
01:52 ๐Ÿ”— underscor SketchCow: But how am I going to write cool things like my utility to watch disk storage without a user account!
01:52 ๐Ÿ”— underscor :D
01:53 ๐Ÿ”— SketchCow Oh, you mean that script that's constantly slamming the Disk I/O so a webpage can update?
01:53 ๐Ÿ”— kennethre ouch :)
01:53 ๐Ÿ”— underscor It's not constantly slamming the disk IO
01:53 ๐Ÿ”— underscor df -m is not resource intensive
01:54 ๐Ÿ”— underscor Also, it only runs every 2 second
01:54 ๐Ÿ”— underscor s
01:57 ๐Ÿ”— Coderjoe --no-sync do not invoke sync before getting usage info (default)
01:58 ๐Ÿ”— kennethre the process fork is prob more expensive than the call itself
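The fork-cost point above can be sketched in Python: read the filesystem counters in-process instead of spawning `df` every two seconds. This is a minimal illustration, not underscor's actual tracker script; the function names `free_megabytes` and `poll_free_space` are hypothetical.

```python
import shutil
import time


def free_megabytes(path="/"):
    """Return free space on the filesystem holding `path`, in MiB."""
    # shutil.disk_usage asks the OS directly; no child process is forked.
    usage = shutil.disk_usage(path)
    return usage.free // (1024 * 1024)


def poll_free_space(path="/", interval=2.0, samples=3):
    """Collect `samples` readings, sleeping `interval` seconds between them."""
    readings = []
    for i in range(samples):
        readings.append(free_megabytes(path))
        if i < samples - 1:
            time.sleep(interval)
    return readings
```

Like `df` without `--sync`, this reads cached filesystem statistics and does not force a flush to disk.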
02:02 ๐Ÿ”— SketchCow > Content-Length: 204820131840
02:02 ๐Ÿ”— SketchCow > Expect: 100-continue
02:02 ๐Ÿ”— underscor nice
02:02 ๐Ÿ”— underscor SketchCow: What's the identifier?
02:02 ๐Ÿ”— SketchCow Let it go for a while, it's STILL uploading.
02:03 ๐Ÿ”— tef oh joy 100 continue
02:03 ๐Ÿ”— SketchCow I am worried about this - I think it might be possible kennethre can upload faster than I can output to archive.org.
02:03 ๐Ÿ”— underscor I know, I just want to watch it in the s3 console
02:03 ๐Ÿ”— tef SketchCow: can't kennethre upload directly to archive.org with a s3 token ?
02:03 ๐Ÿ”— underscor (now that sam explained it to me)
02:03 ๐Ÿ”— SketchCow Yes and no.
02:03 ๐Ÿ”— underscor tef: Not with the way the scripts currently work
02:03 ๐Ÿ”— tef if he finds the right magic incantations to put in the headers
02:03 ๐Ÿ”— tef ah
02:04 ๐Ÿ”— SketchCow He COULD, but a script needs to be run a certain way to track how the sets are generated.
02:04 ๐Ÿ”— underscor It would have to be rather kludgy
02:04 ๐Ÿ”— tef I can do kludgy
02:04 ๐Ÿ”— underscor Also, yeah
02:04 ๐Ÿ”— underscor the tracker requires things are done a certain way
02:04 ๐Ÿ”— SketchCow Already, we're doing 200gb sets into archive.org, which it kind of hates.
02:04 ๐Ÿ”— SketchCow We're going to produce many sets at that rate.
02:04 ๐Ÿ”— tef http://i.imgur.com/Lus4Y.png
02:08 ๐Ÿ”— kennethre SketchCow: oh i can scale much much more if you'd like me to prove :)
02:08 ๐Ÿ”— kennethre SketchCow: I'm completely off though. I have been all day
02:09 ๐Ÿ”— kennethre If straight-to-s3 would be easy, that'd be *much* preferred, since i'm already on ec2
02:09 ๐Ÿ”— tef archive.org runs an s3-like api
02:09 ๐Ÿ”— underscor Yeah, it's not actually to s3
02:10 ๐Ÿ”— underscor It's to IA's s3-compatible API
02:10 ๐Ÿ”— underscor kennethre: But you'd just clog up the API
02:10 ๐Ÿ”— underscor It's slower than even batcave is
02:10 ๐Ÿ”— SketchCow Yeah, I might experiment with FTP.
02:10 ๐Ÿ”— underscor SketchCow: I know you personally like using the regular public things, but you will get a LOT faster performance if you send your own contrib_submit
02:10 ๐Ÿ”— SketchCow Just to see if that machine is faster.
02:10 ๐Ÿ”— underscor It will do direct rsync from batcave to the destination petabox
02:11 ๐Ÿ”— SketchCow Not interested.
02:11 ๐Ÿ”— underscor Okay
02:11 ๐Ÿ”— SketchCow Improve S3, don't support internal-only hacks
02:11 ๐Ÿ”— SketchCow Because then they own you
02:11 ๐Ÿ”— kennethre underscor: i thought you meant actual s3
02:11 ๐Ÿ”— underscor I use it for the 400GB daily stuff I've been stuffing in
02:11 ๐Ÿ”— underscor Well, use a client that supports chunked transfers then
02:11 ๐Ÿ”— underscor We added that to the API
02:12 ๐Ÿ”— underscor Then you'll get amazing performance
02:12 ๐Ÿ”— SketchCow What clients.
02:12 ๐Ÿ”— underscor Basically axel in reverse
02:12 ๐Ÿ”— SketchCow Curl?
02:12 ๐Ÿ”— tef underscor: why would chunked transfers make a difference though ?
02:12 ๐Ÿ”— underscor Sorry
02:13 ๐Ÿ”— underscor Not chunked, multipart
02:13 ๐Ÿ”— underscor SketchCow: No, curl doesn't
02:13 ๐Ÿ”— underscor I'm looking to see if s3cmd does
02:13 ๐Ÿ”— tef hmm
02:13 ๐Ÿ”— underscor one moment
02:13 ๐Ÿ”— tef I mean if you can do appending (unlike amazon s3 iirc) that would be nice I guess
02:13 ๐Ÿ”— underscor fuck me this computer is slow
02:13 ๐Ÿ”— underscor Yeah, you can do resuming
02:13 ๐Ÿ”— underscor But the biggest boon is that it's parallel
02:14 ๐Ÿ”— underscor So you get wonderful performance
02:14 ๐Ÿ”— tef aaah I see
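A rough sketch of why multipart helps, as described above: the file is cut into byte ranges that are sent concurrently and can be retried one at a time. The `send_part` callable is a hypothetical stand-in for whatever actually performs the HTTP request; nothing here reflects the real IA or Amazon S3 client API.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(total_size, part_size):
    """Return (offset, length) pairs that together cover total_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts = []
    offset = 0
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((offset, length))
        offset += length
    return parts


def upload_parts(data, part_size, send_part, workers=4):
    """Send every part through send_part(index, chunk) using a thread pool.

    Results are returned in part order, whatever order the uploads finish in.
    """
    parts = split_into_parts(len(data), part_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(send_part, index, data[offset:offset + length])
            for index, (offset, length) in enumerate(parts)
        ]
        return [future.result() for future in futures]
```

Resuming falls out of the same structure: only the parts whose upload failed need to be sent again.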
02:14 ๐Ÿ”— kennethre what's wrong with the rsync situation now?
02:15 ๐Ÿ”— SketchCow The main problem right now is that I have a LOT of things going on batcave right now.
02:15 ๐Ÿ”— underscor SketchCow: https://gist.github.com/977597
02:15 ๐Ÿ”— kennethre so we need more batcaves :)
02:15 ๐Ÿ”— underscor Needs slight modification to point to ias3 though
02:15 ๐Ÿ”— Coderjoe huh
02:15 ๐Ÿ”— kennethre our automated migrations off of batcave
02:15 ๐Ÿ”— Coderjoe contrib_submit doesn't look like internal-only O_O
02:15 ๐Ÿ”— kennethre *or
02:16 ๐Ÿ”— underscor Coderjoe: If you submit a job it will say "nope"
02:17 ๐Ÿ”— underscor <result type="error" code="internal_error"><message>external caller</message></result>
02:17 ๐Ÿ”— underscor http://www.archive.org/contrib_submit.php
02:18 ๐Ÿ”— Coderjoe it's happily showing me the help at least
02:18 ๐Ÿ”— tef underscor: iirc don't you need additional headers for ia s3 ?
02:18 ๐Ÿ”— underscor Yeah, help is public
02:18 ๐Ÿ”— underscor tef: Yes
02:19 ๐Ÿ”— underscor That's what I was saying
02:19 ๐Ÿ”— underscor Needs slight modification
02:19 ๐Ÿ”— underscor # Metadata that we need to pass in before attempting an upload.
02:19 ๐Ÿ”— underscor content_type = guess_type(local_file, False)[0] or "application/octet-stream"
02:19 ๐Ÿ”— underscor basic_headers = {
02:19 ๐Ÿ”— underscor "Content-Type" : content_type,
02:19 ๐Ÿ”— underscor }
02:19 ๐Ÿ”— tef aaah
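The pasted fragment can be read as the following self-contained function. The wrapper name `build_basic_headers` is hypothetical, and the additional IA-specific headers mentioned above are left out because the log never says what they are.

```python
from mimetypes import guess_type


def build_basic_headers(local_file):
    """Build the headers that must be known before the upload starts."""
    # guess_type returns (type, encoding); strict=False also accepts common
    # non-standard types. Fall back to a generic binary type when unknown.
    content_type = guess_type(local_file, False)[0] or "application/octet-stream"
    return {"Content-Type": content_type}
```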
02:19 ๐Ÿ”— tef oh btw I made an example crawler that makes warcs and uses requests https://github.com/tef/crawler. the pipelining is noice.
02:20 ๐Ÿ”— kennethre tef: damn you're faster than me :)
02:20 ๐Ÿ”— tef https://github.com/tef/crawler
02:20 ๐Ÿ”— underscor But it supports chunking natively, so you'll likely get >100mbps
02:20 ๐Ÿ”— kennethre tef: quite excellent - notice a big speedup w/ the auto keep-alive?
02:20 ๐Ÿ”— tef yeah I kinda stayed up till 9am that night
02:20 ๐Ÿ”— tef kennethre: I had a code sample of a crawler and a library for warcs, oh and requests helped :-)
02:21 ๐Ÿ”— kennethre hehe
02:21 ๐Ÿ”— kennethre <3
02:21 ๐Ÿ”— tef most of the changes were going 'fuck it it doesnt need to be in a bunch of different files'
02:21 ๐Ÿ”— tef that said
02:21 ๐Ÿ”— tef I think I pasted you where I recreate the http messages from accessors
02:22 ๐Ÿ”— kennethre yeah
02:22 ๐Ÿ”— tef makes me feel wrong and dirty inside
02:22 ๐Ÿ”— kennethre i need to add an iteritems() method on CaseInsensitiveDict
02:22 ๐Ÿ”— kennethre oh those
02:22 ๐Ÿ”— kennethre gotcha
02:22 ๐Ÿ”— kennethre different person :)
02:23 ๐Ÿ”— tef https://github.com/tef/crawler/blob/master/crawler.py#L107
02:23 ๐Ÿ”— tef literally I want to make that bit pretty i.e a sort of raw-ish http output
02:24 ๐Ÿ”— tef one day *dreams*
02:24 ๐Ÿ”— kennethre one thing at a time :)
02:25 ๐Ÿ”— tef one day there will be a http library with parsers that aren't intertwined with sockets
02:25 ๐Ÿ”— kennethre hah, that's hilarious
02:25 ๐Ÿ”— tef (and there is one, I wrote it for warc processing)
02:25 ๐Ÿ”— tef hooray! code reuse!
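What "parsers that aren't intertwined with sockets" could look like, as a minimal sketch: a function that takes bytes already in memory and never touches the network. This is not tef's warc-processing code; `parse_response_head` is a hypothetical name, and header folding and malformed input are deliberately ignored.

```python
def parse_response_head(raw):
    """Split a raw HTTP response into (status, reason, headers, body).

    Works on any bytes object, so the same code serves sockets, files
    and archived records alike.
    """
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    # Status line: "HTTP/1.1 200 OK"; the reason phrase may be absent.
    _version, status, reason = (lines[0].split(b" ", 2) + [b""])[:3]
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(b":")
        headers.append(
            (name.strip().decode("iso-8859-1"), value.strip().decode("iso-8859-1"))
        )
    return int(status), reason.decode("iso-8859-1"), headers, body
```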
02:25 ๐Ÿ”— kennethre yeah, the standard lib can be a bit cancerous at times
02:25 ๐Ÿ”— tef urilib3000
02:26 ๐Ÿ”— kennethre which is the main thing that's keeping me from putting requests into the standard lib
02:26 ๐Ÿ”— tef oh god don't do that
02:26 ๐Ÿ”— kennethre i know, right?
02:27 ๐Ÿ”— kennethre found a great quote today
02:27 ๐Ÿ”— kennethre "Bundling into core Python requires a package to be essentially stable, i.e., dead."
02:27 ๐Ÿ”— tef might as well put your code on sourceforge
02:27 ๐Ÿ”— kennethre nah, it'd get tons of use
02:27 ๐Ÿ”— kennethre and it would do a huge service to the community
02:28 ๐Ÿ”— tef is it finished though ?
02:28 ๐Ÿ”— kennethre nah, i have a grad student working on oauth
02:28 ๐Ÿ”— tef I mean, can you build a s3 library on top of requests ?
02:28 ๐Ÿ”— kennethre that's the major thing i want to get before 1.0
02:28 ๐Ÿ”— kennethre of course
02:28 ๐Ÿ”— kennethre boto is working actively to move to requests :)
02:28 ๐Ÿ”— tef so you do 100-expect on post messages with a specific size ?
02:29 ๐Ÿ”— kennethre oh so that's the other huge thing
02:29 ๐Ÿ”— kennethre when i dropped urllib2, i lost streaming uploads
02:29 ๐Ÿ”— kennethre so that needs to happen.
02:29 ๐Ÿ”— kennethre those are the two major pieces
02:29 ๐Ÿ”— kennethre won't be too hard
02:30 ๐Ÿ”— tef then beautiful soup needs to be included :v
02:30 ๐Ÿ”— kennethre NO
02:30 ๐Ÿ”— * kennethre stabs tef
02:30 ๐Ÿ”— tef haha
02:30 ๐Ÿ”— kennethre so many people want me to add content-type-decoding
02:31 ๐Ÿ”— kennethre "why do i have to deserialize my json?"
02:31 ๐Ÿ”— tef I threatened to burn down a coworkers house for pushing something into a sprint midway
02:31 ๐Ÿ”— kennethre stfu
02:31 ๐Ÿ”— kennethre haha
02:31 ๐Ÿ”— tef kennethre: because actually a generalized mechanize like thing would be an obvious thing to build atop
02:31 ๐Ÿ”— kennethre yeah i'd love to replace mechanize
02:31 ๐Ÿ”— kennethre though multi-mechanize is pretty nice now
02:31 ๐Ÿ”— tef my boss said to him 'this is reality. not fantasy. things won't get done. '
02:31 ๐Ÿ”— kennethre lol
02:32 ๐Ÿ”— tef in a different convo
02:32 ๐Ÿ”— tef the other boss also said 'we need to have consensus and not fuck with the sprint' essentially
02:32 ๐Ÿ”— tef I like my job \o/
02:32 ๐Ÿ”— kennethre sounds like you like scrum :)
02:33 ๐Ÿ”— tef although it's been a year long argument to get this sort of fortnight driven releases
02:33 ๐Ÿ”— tef I dunno if it is scrum
02:33 ๐Ÿ”— tef it's more like ' version numbers are now a measure of time, not features'
02:33 ๐Ÿ”— tef we work out the priorities every two weeks and work how much time we're willing to *spend* on features, not estimates
02:33 ๐Ÿ”— tef thing is as a result we actually know what is getting fixed and done
02:34 ๐Ÿ”— kennethre i hate anything that's not autonomous :)
02:34 ๐Ÿ”— tef haha, well we are somewhat autonomous
02:34 ๐Ÿ”— tef this is where we get together with business and ensure we know what the fuck is happening
02:34 ๐Ÿ”— tef with clients, contracts, sales, etc
02:35 ๐Ÿ”— kennethre for-profit archive company?
02:36 ๐Ÿ”— tef yeah
02:36 ๐Ÿ”— kennethre interesting
02:36 ๐Ÿ”— tef compliance archiving mostly
02:36 ๐Ÿ”— kennethre how does that work?
02:36 ๐Ÿ”— tef but we do research grants too for our academic bit on the side
02:36 ๐Ÿ”— tef it's either brand heritage
02:36 ๐Ÿ”— tef or things like sarbanes oxley or ftc rules about archiving the fuck out of everything
02:37 ๐Ÿ”— kennethre heh
02:37 ๐Ÿ”— kennethre interesting
02:38 ๐Ÿ”— tef my boss offically supports me trying to do archive team things
02:38 ๐Ÿ”— tef I was planning to get a snapshot bot that takes page warcs from things pasted in here
02:38 ๐Ÿ”— tef in my dubious free time
02:39 ๐Ÿ”— kennethre haha
02:39 ๐Ÿ”— kennethre do it
02:39 ๐Ÿ”— kennethre thats what i was going to build for requests
02:39 ๐Ÿ”— kennethre warcify([response, response, response])
02:39 ๐Ÿ”— kennethre i might be able to ship it in requests itself :)
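A hedged sketch of what a `warcify` helper might do: wrap each captured HTTP response in a WARC/1.0 `response` record. The signature here takes `(uri, raw_http_bytes)` pairs rather than requests response objects, which is an assumption made for illustration; it is not the API kennethre describes, and a real writer would also emit `warcinfo` and `request` records.

```python
import uuid
from datetime import datetime, timezone


def warc_response_record(target_uri, http_bytes, when=None):
    """Serialise one raw HTTP response as a WARC/1.0 'response' record."""
    when = when or datetime.now(timezone.utc)
    header_lines = [
        "WARC/1.0",
        "WARC-Type: response",
        "WARC-Target-URI: " + target_uri,
        "WARC-Date: " + when.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "WARC-Record-ID: <urn:uuid:%s>" % uuid.uuid4(),
        "Content-Type: application/http; msgtype=response",
        "Content-Length: %d" % len(http_bytes),
    ]
    head = "\r\n".join(header_lines).encode("utf-8") + b"\r\n\r\n"
    # Each record is its header, the block, then two CRLFs.
    return head + http_bytes + b"\r\n\r\n"


def warcify(responses):
    """Concatenate records for an iterable of (uri, raw_http_bytes) pairs."""
    return b"".join(warc_response_record(uri, raw) for uri, raw in responses)
```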
02:40 ๐Ÿ”— Coderjoe you had to name it something that allowed for confusion over subject
02:40 ๐Ÿ”— tef well first I have to finish making this release deployable
02:40 ๐Ÿ”— kennethre Coderjoe: it's a great name :)
02:49 ๐Ÿ”— tef kennethre: next up replace mime handling please
02:49 ๐Ÿ”— kennethre tef: elaborate..
02:49 ๐Ÿ”— tef ah my co-worker complains about the mime library in python
02:51 ๐Ÿ”— kennethre for headers?
02:51 ๐Ÿ”— kennethre i think
02:51 ๐Ÿ”— kennethre oh well it's not even there in python3 anymore
02:51 ๐Ÿ”— kennethre pretty sure they killed it
02:51 ๐Ÿ”— tef \o/
02:54 ๐Ÿ”— kennethre no it's bad
02:54 ๐Ÿ”— kennethre they removed critical functionality
02:54 ๐Ÿ”— kennethre like a way to detect upload boundaries
02:54 ๐Ÿ”— kennethre no big deal
02:54 ๐Ÿ”— tef heh
02:54 ๐Ÿ”— kennethre (i fucking hate python 3)
02:54 ๐Ÿ”— tef well breaking things in py3 is ok
02:54 ๐Ÿ”— tef why do you hate it ?
02:55 ๐Ÿ”— kennethre porting requests to it was one of the most difficult things i've ever had to do
02:55 ๐Ÿ”— kennethre as a python dev
02:55 ๐Ÿ”— tef ah
02:55 ๐Ÿ”— tef was it bytes/strings?
02:55 ๐Ÿ”— kennethre the forced bytes/str separation
02:55 ๐Ÿ”— kennethre yes
02:55 ๐Ÿ”— kennethre it's not so bad on its own
02:55 ๐Ÿ”— tef I mean, aren't you just doing stuff with bytestrings all the time
02:55 ๐Ÿ”— kennethre but supporting 2.x and 3.x at the same time makes it more difficult
02:56 ๐Ÿ”— kennethre no
02:56 ๐Ÿ”— kennethre in 2, you can use bytes or unicode everywhere
02:56 ๐Ÿ”— kennethre in 3.x i just made it so all the non-binary entry points expect unicode
02:56 ๐Ÿ”— kennethre once i did that and separated response text from content (which was auto unicode-decoded before)
02:56 ๐Ÿ”— kennethre it wasn't *so* bad
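The text/content split described above, reduced to a toy Python 3 class. This is a hypothetical `Response`, not the implementation in requests: raw bytes are always available as `content`, and decoding to `str` only happens when `text` is asked for.

```python
class Response:
    """Toy response object separating raw bytes from decoded text."""

    def __init__(self, content, encoding="utf-8"):
        self.content = content    # bytes exactly as received
        self.encoding = encoding  # assumed charset, e.g. taken from a header

    @property
    def text(self):
        # Decode lazily; undecodable bytes become U+FFFD instead of raising.
        return self.content.decode(self.encoding, errors="replace")
```

Binary payloads such as zip files never pass through a decoder unless the caller explicitly reads `text`.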
02:56 ๐Ÿ”— tef yeah
02:57 ๐Ÿ”— kennethre still, i don't think the way it was was so bad
02:57 ๐Ÿ”— kennethre i never had any problems.
02:57 ๐Ÿ”— tef it sounds more like it shook out a design bug
02:57 ๐Ÿ”— kennethre the real problem
02:57 ๐Ÿ”— kennethre is half the standard lib is broken now
02:57 ๐Ÿ”— tef the real problem is unicode/bytes is awful
02:57 ๐Ÿ”— kennethre the bytes/unicode thing is all over the place
02:57 ๐Ÿ”— kennethre lol yes
02:57 ๐Ÿ”— kennethre http://lucumr.pocoo.org/2011/12/7/thoughts-on-python3/
02:58 ๐Ÿ”— kennethre ^^ I agree with every word of this.
02:58 ๐Ÿ”— tef ah yes
02:59 ๐Ÿ”— tef thing is to me python 3 is a fork of python
02:59 ๐Ÿ”— tef supported by the core devs
02:59 ๐Ÿ”— tef but unlike within 2x I can't from __future__ import the bits I need
02:59 ๐Ÿ”— kennethre I can pretty much agree with that
02:59 ๐Ÿ”— tef I have to rewrite for python 3
03:00 ๐Ÿ”— kennethre we all do
03:00 ๐Ÿ”— kennethre it's a new language
03:00 ๐Ÿ”— tef and at that point, why not ruby or other things ?
03:00 ๐Ÿ”— kennethre and it wasn't needed imo
03:00 ๐Ÿ”— tef thing is, they also broke the abi at the same time
03:00 ๐Ÿ”— tef I mean, backwards incompatible changes are good
03:00 ๐Ÿ”— tef but you need to be able to opt-in early
03:01 ๐Ÿ”— tef at this rate someone could fork 2.8 and get away with it
03:01 ๐Ÿ”— tef fork /a/ 2.8
03:02 ๐Ÿ”— kennethre yes
03:02 ๐Ÿ”— kennethre and that's exactly what I, as a developer, want.
03:02 ๐Ÿ”— kennethre a 2.8
03:02 ๐Ÿ”— tef if I bring this up online I am pretty sure I will get told off for trolling
03:02 ๐Ÿ”— tef because core devs seem quite sensitive about 3
03:04 ๐Ÿ”— kennethre i'll just get the PSF to fund 2.8
03:04 ๐Ÿ”— kennethre lol
03:04 ๐Ÿ”— kennethre no big deal
03:04 ๐Ÿ”— tef haha
03:04 ๐Ÿ”— kennethre pypy can do it
03:04 ๐Ÿ”— tef well if there was a 2.8 which came with python 3 tacked on
03:04 ๐Ÿ”— tef I mean, heh i'll just write a rpc layer with ctypes :v
03:05 ๐Ÿ”— kennethre "If 2to3 is our upgrade path to Python 3, then py2js is the upgrade path to JavaScript."
03:05 ๐Ÿ”— kennethre lawl
03:05 ๐Ÿ”— tef nice
03:05 ๐Ÿ”— kennethre tef: you can't, because they use the same namespace identifier
03:07 ๐Ÿ”— tef kennethre: run it in a separate process
03:07 ๐Ÿ”— kennethre tef: or just don't run it at all ;)
03:08 ๐Ÿ”— tef heheh
03:11 ๐Ÿ”— tef yeah I am sure talk of 2.8 will get you lynched at pycon
03:14 ๐Ÿ”— kennethre nah, everyone loves me :)
03:14 ๐Ÿ”— SketchCow Going to pycon should get you lynched at pycon
03:14 ๐Ÿ”— kennethre SketchCow should definitely go to pycon
03:14 ๐Ÿ”— SketchCow Ha ha, next century
03:14 ๐Ÿ”— * SketchCow packs
03:15 ๐Ÿ”— kennethre not a fan of the pythons?
03:15 ๐Ÿ”— SketchCow Not a fan of the guido
03:15 ๐Ÿ”— SketchCow And guide is a big influence on the pythons
03:15 ๐Ÿ”— kennethre interesting
03:15 ๐Ÿ”— SketchCow guide/guido
03:15 ๐Ÿ”— kennethre any particular reason?
03:15 ๐Ÿ”— SketchCow He's a jerk?
03:15 ๐Ÿ”— kennethre lol, what'd he do?
03:16 ๐Ÿ”— SketchCow What, I need to point to my stolen lolly and my popped balloon?
03:16 ๐Ÿ”— tef most programmers are jerks though
03:16 ๐Ÿ”— SketchCow Most are, yeah.
03:16 ๐Ÿ”— tef might be observational bias on my account
03:16 ๐Ÿ”— kennethre i've met him, seemed pretty awkward but not confrontational or anything
03:16 ๐Ÿ”— kennethre he's no linus torvalds :)
03:16 ๐Ÿ”— kennethre *that* is a jerk
03:17 ๐Ÿ”— SketchCow Shhh, I'm in his home country
03:17 ๐Ÿ”— SketchCow agents will come
03:17 ๐Ÿ”— kennethre lmao
03:17 ๐Ÿ”— tef perkele
03:17 ๐Ÿ”— kennethre the gits
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/msj_pin6.zip (deflated 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/msj_pin9.zip (deflated 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/MSJ!PIN.NFO (deflated 74%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/msj_pin1.zip (deflated 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/msj_pin4.zip (deflated 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/msj_pin2.zip (deflated 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Pinball_Arcade_CD-MSJ/msj_pin7.zip (deflated 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Arena_All_Continential_Maps-POLICE/ (stored 0%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Arena_All_Continential_Maps-POLICE/FILE_ID.DIZ (deflated 59%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Arena_All_Continential_Maps-POLICE/POLICE.NFO (deflated 77%)
03:17 ๐Ÿ”— SketchCow adding: 1994/Arena_All_Continential_Maps-POLICE/plc_amap.zip (deflated 0%)
03:17 ๐Ÿ”— tef but yeah if I didn't like software because I didn't like the authors, I wouldn't be left with a lot of software to use
03:17 ๐Ÿ”— SketchCow adding: 1994/Beneath_A_Steel_Sky_GERMAN-UA/ (stored 0%)
03:18 ๐Ÿ”— SketchCow adding: 1994/Beneath_A_Steel_Sky_GERMAN-UA/ua-sky5.zip (deflated 0%)
03:18 ๐Ÿ”— SketchCow adding: 1994/Beneath_A_Steel_Sky_GERMAN-UA/FILE_ID.DIZ (deflated 47%)
03:18 ๐Ÿ”— SketchCow I'd say today is a good day
03:18 ๐Ÿ”— SketchCow Not true
03:18 ๐Ÿ”— SketchCow You'd be left with archiveteam tools
03:18 ๐Ÿ”— SketchCow That's all you need
03:18 ๐Ÿ”— kennethre hehehe
03:18 ๐Ÿ”— Aranje but the linuxes it runs on!
03:18 ๐Ÿ”— kennethre steve jobs was a jerk too
03:18 ๐Ÿ”— kennethre made damn good systems though
03:19 ๐Ÿ”— * Aranje doesn't like them
03:19 ๐Ÿ”— SketchCow No, he pounded human beings into the ground to make them make good enough systems
03:19 ๐Ÿ”— SketchCow Also, Archive Team is the Yelling Bird of Programming Guilds
03:19 ๐Ÿ”— underscor haha
03:19 ๐Ÿ”— SketchCow http://questionablecontent.wikia.com/wiki/Yelling_Bird
03:20 ๐Ÿ”— kennethre SketchCow: I can agree with that :)
03:20 ๐Ÿ”— tef I made this t-shirt about 7 years ago http://printf.net/~tef/photos/tshirt/tef_tourette.jpg
03:21 ๐Ÿ”— tef but in scotland, cunt is a more affectionate term than offensive
03:21 ๐Ÿ”— SketchCow http://www.indietits.com/comics/blogs.png
03:21 ๐Ÿ”— tef sometimes :v
03:22 ๐Ÿ”— underscor tef: hahaha
03:22 ๐Ÿ”— underscor oh noes, down to 800GB
03:23 ๐Ÿ”— kennethre SketchCow: the reason I asked about guido is because I've never heard anyone say that about him before :)
03:23 ๐Ÿ”— SketchCow http://www.questionablecontent.net/indietits/comics/wowagain.png
03:24 ๐Ÿ”— SketchCow We're kind of fucked, I'm trying to get files off of batcave and I am up at 6am and the bus leaves at 9:30am and then I'm on a plane for 10 hours.
03:24 ๐Ÿ”— SketchCow No real solution.
03:24 ๐Ÿ”— kennethre hopefully a wifi plane then :)
03:25 ๐Ÿ”— kennethre I use too many emoticons
03:25 ๐Ÿ”— kennethre o(O.o)o
03:25 ๐Ÿ”— SketchCow Nah, we're fucked, we're bringing it all in too fast.
03:26 ๐Ÿ”— underscor Solution: Give me sudo, and I can help load into IA
03:26 ๐Ÿ”— underscor :D
03:26 ๐Ÿ”— SketchCow Yeah, wait, let me check my... NO
03:26 ๐Ÿ”— underscor Especially since it's friday night so I have no bedtime!
03:27 ๐Ÿ”— underscor SketchCow: Aww :(
03:27 ๐Ÿ”— underscor Or just chmod them then :P
03:27 ๐Ÿ”— tef well I think trying to work out a way just to push it to ia directly will be faster than batcave to ia
03:27 ๐Ÿ”— SketchCow No such solution.
03:27 ๐Ÿ”— underscor ^
03:27 ๐Ÿ”— SketchCow I'm now seeing if I can get google groups loaded over.
03:28 ๐Ÿ”— Aranje lol underscor you authwhore
03:28 ๐Ÿ”— underscor Authwhore?
03:28 ๐Ÿ”— underscor :P
03:28 ๐Ÿ”— Aranje authwhore.
03:28 ๐Ÿ”— Aranje You heard me :P
03:28 ๐Ÿ”— tef no sane person asks for root
03:28 ๐Ÿ”— underscor hahaha
03:28 ๐Ÿ”— * underscor is not sane
03:28 ๐Ÿ”— tef root is a burden
03:28 ๐Ÿ”— Aranje haha
03:28 ๐Ÿ”— underscor true
03:28 ๐Ÿ”— Aranje I share root on all my servers
03:28 ๐Ÿ”— underscor tef: very true
03:28 ๐Ÿ”— tef it's going 'yes blame me for everything and I have to fix shit'
03:29 ๐Ÿ”— kennethre he didn't ask for root, only sudo :P
03:29 ๐Ÿ”— underscor Also, yes
03:29 ๐Ÿ”— SketchCow You guys are confusing "Is it programmatically possible to transfer bits into the internet archive servers" and "what are the procedures to politically fit into archive.org's storage paradigm"
03:29 ๐Ÿ”— SketchCow I know all of you can do the first.
03:29 ๐Ÿ”— tef kennethre: crack instead of meth
03:29 ๐Ÿ”— SketchCow Right now I am working on the second, only I can bridge that.
03:29 ๐Ÿ”— tef ah
03:29 ๐Ÿ”— underscor oic
03:29 ๐Ÿ”— SketchCow root@teamarchive-0:/3# du -sh MOBILEME
03:29 ๐Ÿ”— SketchCow 398G MOBILEME
03:29 ๐Ÿ”— SketchCow du -sh *SETS
03:29 ๐Ÿ”— SketchCow root@teamarchive-0:/3# du -sh *SETS
03:29 ๐Ÿ”— SketchCow 3.5T MOBILEME-SETS
03:29 ๐Ÿ”— tef well I figure the technical barriers are easy to solve and I can help with that
03:30 ๐Ÿ”— SketchCow Anyway, as we see there, it's not necessarily mobileme filling that drive (11tb)
03:30 ๐Ÿ”— SketchCow Let me see if I can get something going.
03:30 ๐Ÿ”— underscor SketchCow: Not me anymore either
03:30 ๐Ÿ”— tef but yeah social problems are always the annoying ones in programming, I have no idea :v
03:30 ๐Ÿ”— underscor I have ~2GB on that drive now
03:30 ๐Ÿ”— underscor I'm clearing out everything else I can too
03:31 ๐Ÿ”— SketchCow I am sure it's googlegroups.
03:31 ๐Ÿ”— underscor btw, mailing drives next monday or tuesday
03:31 ๐Ÿ”— kennethre We should just start a large storage-as-a-service company and abuse it
03:31 ๐Ÿ”— underscor so they should be there in time for you
03:32 ๐Ÿ”— SketchCow Already am, dude
03:32 ๐Ÿ”— SketchCow it's called archive.org
03:32 ๐Ÿ”— tef heheheh
03:32 ๐Ÿ”— underscor hahaha
03:32 ๐Ÿ”— kennethre without the social problem :)
03:32 ๐Ÿ”— tef kennethre: no if it has people it has social problems
03:32 ๐Ÿ”— SketchCow if it has any of you assholes it has social problems
03:32 ๐Ÿ”— underscor ^
03:32 ๐Ÿ”— kennethre haha
03:32 ๐Ÿ”— SketchCow It's like firing a frathouse into a nunnery
03:33 ๐Ÿ”— underscor :D
03:33 ๐Ÿ”— Aranje :D
03:33 ๐Ÿ”— underscor that's an awesome analogy
03:33 ๐Ÿ”— tef there are some pisshead archivists i've met
03:33 ๐Ÿ”— tef there was this awesome chap who does the digital stuff for nz library
03:33 ๐Ÿ”— tef but most of them were quite straight laced
03:37 ๐Ÿ”— tef although it seems how you start a fight with an archivist is by arguing over naming things
03:37 ๐Ÿ”— kennethre haha
03:38 ๐Ÿ”— tef I can't put any standard bits into warctools because every archive is *special* and *unique*
03:38 ๐Ÿ”— tef every single institution. file names, id numbers, serial numbers. compression settings
03:38 ๐Ÿ”— tef oh: and I want to find who made the ARC format and kill them
03:39 ๐Ÿ”— tef you have to parse the body of the first record to parse the headers of the first and subsequent records *head explodes*
03:40 ๐Ÿ”— kennethre http itself is already a pretty good format :)
03:40 ๐Ÿ”— tef almost
03:40 ๐Ÿ”— kennethre just need some metadata around it
03:41 ๐Ÿ”— tef iso-8859-1
03:41 ๐Ÿ”— tef there were fiths about using mime
03:41 ๐Ÿ”— kennethre mime can die in a fire
03:41 ๐Ÿ”— tef warc needs a transfer-chunked style thing
03:41 ๐Ÿ”— kennethre parsing http headers properly is something that never happens
03:41 ๐Ÿ”— tef so I don't need to keep them in memory
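For reference, the framing tef alludes to is HTTP/1.1's chunked transfer coding, where each piece carries its own length so the total size never has to be known (or buffered) up front. The sketch below shows that HTTP framing only; it is not part of the WARC format, and `encode_chunked` is a hypothetical helper name.

```python
def encode_chunked(chunks):
    """Yield HTTP/1.1 chunked-coding frames for an iterable of bytes objects."""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would be read as the terminator
            # Frame: hex length, CRLF, the data itself, CRLF.
            yield b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    # Terminating zero-length chunk, with no trailers.
    yield b"0\r\n\r\n"
```

Because it is a generator, a writer can stream an arbitrarily large body while holding only one chunk in memory at a time.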
03:42 ๐Ÿ”— kennethre (header values)
03:42 ๐Ÿ”— tef kennethre: sort of
03:43 ๐Ÿ”— tef omst people don't produce them correctly
03:44 ๐Ÿ”— tef gah, typing with lag ruins my ability to type
03:45 ๐Ÿ”— kennethre irssi + slow ssh?
03:45 ๐Ÿ”— tef yeah
03:45 ๐Ÿ”— tef joy
03:45 ๐Ÿ”— tef hmm
03:45 ๐Ÿ”— kennethre you can do local buffering
03:45 ๐Ÿ”— kennethre well, not with curses nvm
03:46 ๐Ÿ”— tef_ now i'm webscale
03:46 ๐Ÿ”— tef nepotism in action
03:47 ๐Ÿ”— tef but yeah warc is 'not bad' not great by any means
03:50 ๐Ÿ”— SketchCow underscor: What's that countdown page again?
03:50 ๐Ÿ”— underscor http://tracker.archive.org/df.html
03:50 ๐Ÿ”— underscor Except it's a count-up
03:50 ๐Ÿ”— underscor :D
03:50 ๐Ÿ”— kennethre haha
03:50 ๐Ÿ”— kennethre it's a countup :)
03:51 ๐Ÿ”— SketchCow Yeah, take that.
03:52 ๐Ÿ”— * kennethre spins back up
03:52 ๐Ÿ”— SketchCow I still don't want the kennethre horn turned back on, but I found what I was doing wrong somewhere.
03:52 ๐Ÿ”— SketchCow I can get a better grip on things shortly.
03:52 ๐Ÿ”— underscor awesome
03:52 ๐Ÿ”— kennethre awesome :)
03:53 ๐Ÿ”— SketchCow The buffering issue is still there, but at least the space for the slowpokes can be there while I get the uploading more automated.
03:56 ๐Ÿ”— SketchCow This machine is being so hammered. It's doing a straight rm and it's STILL going slow.
03:56 ๐Ÿ”— SketchCow Also, there's a ton of files in that directory, I guess.
03:56 ๐Ÿ”— SketchCow Granted, getting a gigabyte back every second isn't that bad.
03:56 ๐Ÿ”— SketchCow Especially with all the assholes uploading.
03:59 ๐Ÿ”— underscor haha
03:59 ๐Ÿ”— SketchCow * We are completely uploaded and fine
03:59 ๐Ÿ”— SketchCow < HTTP/1.1 200 Ok
03:59 ๐Ÿ”— underscor iotop is cool
03:59 ๐Ÿ”— SketchCow That's the first of the full sets to upload.
03:59 ๐Ÿ”— SketchCow Let's see if s3 shits the bed and then shits a bigger, more intense bed full of shitted beds.
03:59 ๐Ÿ”— underscor What identifier?
04:00 ๐Ÿ”— Coderjoe shitted-bedception?
04:00 ๐Ÿ”— underscor [item_size] => 200019661
04:00 ๐Ÿ”— underscor hahahahaha
04:00 ๐Ÿ”— SketchCow We have to go DEEPER
04:01 ๐Ÿ”— * underscor waits for disk io to shoot up
04:01 ๐Ÿ”— underscor http://ia600802.us.archive.org/mrtg/diskv3.html
04:01 ๐Ÿ”— Coderjoe aww, how cute. it's only 3 weeks old
04:02 ๐Ÿ”— kennethre w00t 1TiB
04:06 ๐Ÿ”— Coderjoe haha
04:09 ๐Ÿ”— kennethre hahahaha
04:11 ๐Ÿ”— SketchCow OK, so I'm still in deletion city with those files and I am currently concentrating on writing the proposal/information for a documentary I'm being hired to film this summer.
04:11 ๐Ÿ”— SketchCow (A commercial job, straight-fee which I can use to pay off some debts/taxes)
04:12 ๐Ÿ”— SketchCow http://www.us.archive.org/log_show.php?task_id=96126299 is just loving copying over a 198gb file, let me tell you
04:19 ๐Ÿ”— * Aranje wonders aloud where urlte.am went
04:20 ๐Ÿ”— SketchCow A hot item on my list to fix.
04:20 ๐Ÿ”— SketchCow After I get batcave under control, it's the next thing up.
04:20 ๐Ÿ”— SketchCow Keep on me about it.
04:21 ๐Ÿ”— * Aranje nods
04:21 ๐Ÿ”— SketchCow It's my fault, for not raping dot.fm in the eye
04:25 ๐Ÿ”— underscor SketchCow: hahahahahahahaha
04:25 ๐Ÿ”— underscor That
04:25 ๐Ÿ”— underscor is
04:25 ๐Ÿ”— underscor the
04:25 ๐Ÿ”— underscor best
04:58 ๐Ÿ”— SketchCow http://ia600807.us.archive.org/zipview.php?zip=/29/items/archiveteam-googlegroups-yw/googlegroups-yw.zip&file=
05:33 ๐Ÿ”— SketchCow http://www.archive.org/details/archiveteam-googlegroups-zb&reCache=1
05:36 ๐Ÿ”— SketchCow 100% done by script.
05:39 ๐Ÿ”— arrith very neat
05:40 ๐Ÿ”— arrith automation is the way to go. congrats on more automation
05:50 ๐Ÿ”— SketchCow Gets better, give me a moment.
06:40 ๐Ÿ”— DFJustin so, apparently using edit.php to add files to your item nukes files that came from s3 because it doesn't know about them
06:50 ๐Ÿ”— Coderjoe did you wait until after derive had run on your item?
06:51 ๐Ÿ”— DFJustin ah that could be it
06:58 ๐Ÿ”— SketchCow OK, so.
06:58 ๐Ÿ”— SketchCow I now have a script running those scripts.
06:58 ๐Ÿ”— SketchCow It's in a screen session.
06:58 ๐Ÿ”— kennethre hah, classy
06:58 ๐Ÿ”— SketchCow So while I'm in the air, it'll upload 909gb of google groups.
06:58 ๐Ÿ”— kennethre scriptception
06:59 ๐Ÿ”— SketchCow http://www.archive.org/details/archiveteam-googlegroups
06:59 ๐Ÿ”— SketchCow You will see entries with more than just googlegroups-XX.zip
06:59 ๐Ÿ”— SketchCow That was the old paradigm
07:02 ๐Ÿ”— SketchCow 4.9gb uploaded so far.
07:03 ๐Ÿ”— SketchCow root@teamarchive-0:/3/googlegroups# ls | wc -l
07:03 ๐Ÿ”— SketchCow 1305
07:33 ๐Ÿ”— SketchCow Off it goes.
12:04 ๐Ÿ”— ersi Hm, http://memac.heroku.com isn't showing downloaded users anymore. Even though I'm fetching and reporting users downloaded to the tracker
14:40 ๐Ÿ”— Coderjoe ersi: WORKSFORME WONTFIX
14:53 ๐Ÿ”— ersi Yeah, it does. I commented in #memac ;)
15:57 ๐Ÿ”— Schbirid damn, ran into a 20 pages per day (per paid account) limit on a site
15:57 ๐Ÿ”— Schbirid i want to get ~8000 pages
15:57 ๐Ÿ”— Schbirid gonna need to script that :)
22:47 ๐Ÿ”— Nemo_bis kennethre, is this you? O_o http://ia700000.us.archive.org:8088/mrtg/networkv2.html
22:48 ๐Ÿ”— Nemo_bis space enough for ~4 h of that
23:29 ๐Ÿ”— kennethre ndurner: i'm not running
23:29 ๐Ÿ”— kennethre er, Nemo_bis ^
23:30 ๐Ÿ”— Nemo_bis ok
23:30 ๐Ÿ”— Nemo_bis :)
23:31 ๐Ÿ”— kennethre i wish it was me :)
23:31 ๐Ÿ”— kennethre i want to be at the top of that dashboard :)
