#archiveteam 2011-10-22,Sat

Time Nickname Message
00:00 🔗 db48x winr4r: awesome
00:00 🔗 winr4r give me a couple of minutes
00:00 🔗 db48x deal
00:04 🔗 db48x the Miserere Allegro has colonized my brain stem
00:04 🔗 db48x even after listening to other music all day, it's running through my mind continuously
00:05 🔗 SketchCow Remember, we got shut down constantly, consistently, and threatened, db48x
00:05 🔗 db48x yea
00:05 🔗 SketchCow So all things considered, it was pretty aggressive suicide on their part.
00:05 🔗 db48x also that only includes what you uploaded the other day
00:05 🔗 SketchCow But add the archiveteam.org stuff, that will hopefully increase it.
00:06 🔗 db48x lots more in the, yea
00:06 🔗 SketchCow Yes. I suspect that's the majority of what we got.
00:06 🔗 SketchCow This was just stuff I found in uploads people dumped to batcave.
00:11 🔗 winr4r http://pastebin.com/NsbryDvm
00:11 🔗 winr4r that should work
00:12 🔗 db48x oops, that goes the wrong way
00:12 🔗 db48x I want to convert the entity reference into the character it references
00:12 🔗 winr4r oh.
00:13 🔗 winr4r dude i just misread you
00:13 🔗 * winr4r headdesk.
00:13 🔗 db48x heh, it happens
00:13 🔗 db48x unhtml is the right name for the function though :)
00:13 🔗 winr4r in my defense, i haven't been awake long. :/
00:16 🔗 winr4r one sec
00:17 🔗 SketchCow Setting up this 2 tb of tarring
00:17 🔗 SketchCow Then will go the transfers. But first... the tarring!
00:22 🔗 winr4r that'll take a while
00:23 🔗 db48x winr4r: take your time
00:23 🔗 db48x it's a tricky problem
00:23 🔗 winr4r i meant what cow's doing
00:23 🔗 db48x oh
00:23 🔗 winr4r as for me, i need caffeine, so brb
00:23 🔗 db48x :)
00:29 🔗 winr4r k all better
00:29 🔗 winr4r does it need to do ones like &#1234; ?
00:30 🔗 db48x unknown
00:30 🔗 db48x I guess it should
00:30 🔗 db48x let's go with utf-8 output
00:40 🔗 db48x not even the Moonlight sonatta can displace the Miserere
00:41 🔗 db48x sonata
00:50 🔗 winr4r http://pastebin.com/EeUpf4SM
00:50 🔗 winr4r that should work
00:51 🔗 winr4r <filename.txt whatever.py > newfile.txt
00:51 🔗 db48x shiny
00:51 🔗 winr4r doesn't account for some joker forgetting the trailing semicolon but fuck 'em
00:51 🔗 db48x indeed
00:53 🔗 winr4r replace the last line with sys.stdout.write(d) if you don't want a forced line break at the end
00:53 🔗 db48x nope, it's perfect
00:53 🔗 SketchCow Oh thank goodness, another script to help me.
00:53 🔗 SketchCow Now it takes a directory of yahoo videos and names it YV-FIRST-LAST as needed.
00:54 🔗 db48x winr4r: superb
00:54 🔗 db48x SketchCow: neat
00:55 🔗 db48x winr4r: no, you're right. I already had a newline, so printing an extra one is extra
00:57 🔗 SketchCow There it goes! 1.6tb of yahoo video being turned into tars.
00:57 🔗 SketchCow That'll be a nice add tonight.
00:58 🔗 winr4r db48x: different version if you want to force a newline only if it doesn't have one at the end: http://pastebin.com/HcxJDr6a
00:58 🔗 winr4r i don't know *why*, but seeing "warning: no newline at end of file" enough times makes you that way
00:59 🔗 winr4r SketchCow: sweet
00:59 🔗 db48x winr4r: :)
00:59 🔗 db48x yea, I got tired of that warning a long time ago
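The "newline only if it doesn't already have one" behaviour mentioned above can be sketched as follows. This is a minimal illustration in Python 3, not the contents of the pastebin; the function name `ensure_trailing_newline` is hypothetical.

```python
def ensure_trailing_newline(text):
    """Return text unchanged if it already ends with a newline,
    otherwise return it with a single newline appended."""
    if text.endswith("\n"):
        return text
    return text + "\n"
```

Writing the result with `sys.stdout.write(ensure_trailing_newline(d))` instead of `print` would avoid both the doubled newline and the "no newline at end of file" warning.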
01:07 🔗 db48x doh
01:07 🔗 db48x it's outputting 0xa9 for &copy;
01:08 🔗 db48x but wait, that's right
01:08 🔗 db48x so why doesn't it survive when pasted?
01:10 🔗 db48x I get a replacement character when I paste :(
01:10 🔗 db48x oh well, the output is perfect
01:11 🔗 winr4r interesting
01:11 🔗 winr4r as in the literal string "0xa9"?
01:12 🔗 db48x no, the byte 0xa9
01:12 🔗 db48x I was expecting a multi-byte sequence
01:12 🔗 winr4r ah
01:12 🔗 winr4r dunno
01:14 🔗 db48x oh, that is wrong
01:14 🔗 db48x it should be a multibyte sequence
01:15 🔗 db48x ahh
01:15 🔗 db48x http://docs.python.org/release/2.3/lib/module-htmlentitydefs.html
01:15 🔗 db48x it maps from entities to ISO-8859-1
01:15 🔗 winr4r oh
01:15 🔗 winr4r silly me
01:15 🔗 db48x no, silly python
01:16 🔗 winr4r no, that was me using the wrong thing, entitydefs rather than name2codepoint
01:16 🔗 db48x entitydefs is such a braindead thing to include
01:17 🔗 db48x it shouldn't even be possible to make that error
01:17 🔗 winr4r heh
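The byte-level difference discussed above can be seen in a short sketch. It assumes Python 3, where the module is named `html.entities` rather than the Python 2 `htmlentitydefs` linked in the log; the variable names are illustrative only.

```python
from html.entities import name2codepoint

# name2codepoint maps an entity name to an integer code point (here U+00A9).
copyright_sign = chr(name2codepoint["copy"])

# ISO-8859-1 encodes U+00A9 as the single byte 0xA9.
latin1_bytes = copyright_sign.encode("iso-8859-1")

# UTF-8 encodes the same character as a two-byte sequence.
utf8_bytes = copyright_sign.encode("utf-8")

# A lone 0xA9 byte is not valid UTF-8, so a UTF-8 decoder substitutes
# the replacement character U+FFFD when asked to replace errors.
pasted = latin1_bytes.decode("utf-8", errors="replace")

print(latin1_bytes, utf8_bytes, repr(pasted))
```

This is likely why the single byte turned into a replacement character when pasted into something expecting UTF-8.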
01:17 🔗 db48x File "/home/db48x/archives/lulupoetry/unified/unhtml.py", line 10, in unhtml
01:18 🔗 db48x TypeError: expected a character buffer object
01:18 🔗 db48x s = s.replace("&%s;" % x, name2codepoint[x])
01:18 🔗 winr4r that should be unichr(name2codepoint[x])
01:18 🔗 winr4r one second
01:19 🔗 db48x ah, heh
01:19 🔗 winr4r (you'll get an error if you do that, you need to set the default encoding first)
01:19 🔗 db48x indeed I do
01:20 🔗 winr4r http://pastebin.com/yr2rnzZV
01:20 🔗 winr4r try that, sorry
01:21 🔗 db48x perfect
01:21 🔗 winr4r reload(sys) because site.py actually deletes the "setdefaultencoding" function from sys, for reasons that probably make sense to python developers
01:21 🔗 db48x http://pastebin.com/nihmctj4
01:22 🔗 db48x heh
01:22 🔗 winr4r are those backslashes meant to be there?
01:22 🔗 db48x they're in the html
01:22 🔗 winr4r k fine, that's not something i screwed up then ;D
01:22 🔗 db48x yea :)
01:23 🔗 winr4r so you're making a text dump of lulu poetry?
01:23 🔗 db48x yep
01:23 🔗 winr4r nice :)
01:23 🔗 db48x the html has so much gorp
01:25 🔗 winr4r mhm
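For reference, the entity-to-character conversion worked out above could look roughly like this in Python 3. This is a sketch under assumptions, not the pastebin script: it uses `html.entities.name2codepoint` (the Python 3 home of the mapping named in the log), swaps the per-entity `replace` loop for a single regex pass, and the names `_ENTITY` and `main` are hypothetical. Like the original, it ignores references that lack the trailing semicolon.

```python
import re
import sys
from html.entities import name2codepoint

# Matches &name;, &#1234; and &#x4d2; (trailing semicolon required).
_ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def unhtml(s):
    """Replace HTML entity references in s with the characters they name."""

    def repl(match):
        ref = match.group(1)
        try:
            if ref[:2] in ("#x", "#X"):
                return chr(int(ref[2:], 16))   # hexadecimal numeric reference
            if ref.startswith("#"):
                return chr(int(ref[1:]))       # decimal numeric reference
            return chr(name2codepoint[ref])    # named entity
        except (KeyError, ValueError, OverflowError):
            # Unknown name or out-of-range number: leave the text untouched.
            return match.group(0)

    return _ENTITY.sub(repl, s)


def main():
    # Read and write bytes explicitly so the output is UTF-8
    # regardless of the locale's default encoding.
    data = sys.stdin.buffer.read().decode("utf-8")
    sys.stdout.buffer.write(unhtml(data).encode("utf-8"))
```

Calling `main()` at the bottom of the file would give the same `<filename.txt whatever.py > newfile.txt` usage as before. Because the substitution is a single pass, text such as `&amp;copy;` becomes `&copy;` rather than being unescaped twice.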
02:11 🔗 SketchCow
02:13 🔗 winr4r took the words out of my mouth
02:23 🔗 BlueMax Yeah, suck those words
02:40 🔗 SketchCow OK, plane is landing.
02:40 🔗 SketchCow Or crashing
02:40 🔗 SketchCow They never really tell you.
02:40 🔗 SketchCow Either way, on the ground in 20.
02:49 🔗 winr4r i didn't even know they had internet on planes, so i figured you had already landed
02:49 🔗 winr4r welcome to the future
02:52 🔗 closure SketchCow: dude, welcome to SF
02:52 🔗 * closure is hanging at noisebridge
02:52 🔗 chronomex sf, home of internet
02:53 🔗 BlueMax sf, home of cisco
02:53 🔗 * BlueMax is the state the obvious guy
02:53 🔗 chronomex sf, home of many hobos
02:56 🔗 BlueMax and a temporary home to SketchCow
02:56 🔗 BlueMax (hobo?)
02:56 🔗 BlueMax :P
03:34 🔗 noob_ Hi does any one have any suggestion for backing up the google buzz / google reader data people have shared with me?
03:50 🔗 lemonkey sketchcow is sleeping at noisebridge?
10:37 🔗 efnu7z I own this network, #hackers 4 skillz & #trustnetwork 4 shellz
10:38 🔗 efnu7z I own this network, #hackers 4 skillz & #trustnetwork 4 shellz
10:39 🔗 chronomex uhhuh.
10:41 🔗 efnu7z I own this network, #hackers 4 skillz & #trustnetwork 4 shellz
10:42 🔗 efnu7z I own this network, #hackers 4 skillz & #trustnetwork 4 shellz
10:43 🔗 chronomex let's see if that stops it
16:23 🔗 winr4r SketchCow: are you still thinking of doing a hangout later?
