Time |
Nickname |
Message |
00:00
🔗
|
db48x |
winr4r: awesome |
00:00
🔗
|
winr4r |
give me a couple of minutes |
00:00
🔗
|
db48x |
deal |
00:04
🔗
|
db48x |
the Miserere Allegro has colonized my brain stem |
00:04
🔗
|
db48x |
even after listening to other music all day, it's running through my mind continuously |
00:05
🔗
|
SketchCow |
Remember, we got shut down constantly, consistently, and threatened, db48x |
00:05
🔗
|
db48x |
yea |
00:05
🔗
|
SketchCow |
So all things considered, it was pretty aggressive suicide on their part. |
00:05
🔗
|
db48x |
also that only includes what you uploaded the other day |
00:05
🔗
|
SketchCow |
But add the archiveteam.org stuff, that will hopefully increase it. |
00:06
🔗
|
db48x |
lots more in the, yea |
00:06
🔗
|
SketchCow |
Yes. I suspect that's the majority of what we got. |
00:06
🔗
|
SketchCow |
This was just stuff I found in uploads people dumped to batcave. |
00:11
🔗
|
winr4r |
http://pastebin.com/NsbryDvm |
00:11
🔗
|
winr4r |
that should work |
00:12
🔗
|
db48x |
oops, that goes the wrong way |
00:12
🔗
|
db48x |
I want to convert the entity reference into the character it references |
00:12
🔗
|
winr4r |
oh. |
00:13
🔗
|
winr4r |
dude i just misread you |
00:13
🔗
|
* |
winr4r headdesk. |
00:13
🔗
|
db48x |
heh, it happens |
00:13
🔗
|
db48x |
unhtml is the right name for the function though :) |
00:13
🔗
|
winr4r |
in my defense, i haven't been awake long. :/ |
00:16
🔗
|
winr4r |
one sec |
00:17
🔗
|
SketchCow |
Setting up this 2 tb of tarring |
00:17
🔗
|
SketchCow |
Then will go the transfers. But first... the tarring! |
00:22
🔗
|
winr4r |
that'll take a while |
00:23
🔗
|
db48x |
winr4r: take your time |
00:23
🔗
|
db48x |
it's a tricky problem |
00:23
🔗
|
winr4r |
i meant what cow's doing |
00:23
🔗
|
db48x |
oh |
00:23
🔗
|
winr4r |
as for me, i need caffeine, so brb |
00:23
🔗
|
db48x |
:) |
00:29
🔗
|
winr4r |
k all better |
00:29
🔗
|
winr4r |
does it need to do ones like &#1234; ? |
00:30
🔗
|
db48x |
unknown |
00:30
🔗
|
db48x |
I guess it should |
00:30
🔗
|
db48x |
let's go with utf-8 output |
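A minimal sketch of the conversion under discussion, written for Python 3 rather than the Python 2 used in this log. The pastebin contents are not reproduced here, so this is not winr4r's script. Only the name `unhtml` comes from the conversation; the regular expression and the helpers `_replace` and `unhtml_utf8` are assumptions made for illustration. It handles named references plus decimal and hexadecimal numeric references, and leaves anything it does not recognise untouched.

```python
import re
from html.entities import name2codepoint

# Assumption: a reference is "&", a name or number, and a required ";".
# Matches &name; &#1234; and &#x4d2;
_ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def _replace(match):
    """Return the character for one entity reference, or the original text."""
    body = match.group(1)
    try:
        if body[:2] in ("#x", "#X"):
            return chr(int(body[2:], 16))   # hexadecimal numeric reference
        if body.startswith("#"):
            return chr(int(body[1:]))       # decimal numeric reference
        return chr(name2codepoint[body])    # named reference
    except (KeyError, ValueError, OverflowError):
        # Unknown name or out-of-range number: leave the reference as written.
        return match.group(0)


def unhtml(text):
    """Convert entity references in text into the characters they reference."""
    return _ENTITY.sub(_replace, text)


def unhtml_utf8(text):
    """Same conversion, returned as UTF-8 encoded bytes."""
    return unhtml(text).encode("utf-8")
```

Because the substitution is a single pass, text such as `&amp;lt;` becomes `&lt;` and is not unescaped a second time.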
00:40
🔗
|
db48x |
not even the Moonlight sonatta can displace the Miserere |
00:41
🔗
|
db48x |
sonata |
00:50
🔗
|
winr4r |
http://pastebin.com/EeUpf4SM |
00:50
🔗
|
winr4r |
that should work |
00:51
🔗
|
winr4r |
<filename.txt whatever.py > newfile.txt |
00:51
🔗
|
db48x |
shiny |
00:51
🔗
|
winr4r |
doesn't account for some joker forgetting the trailing semicolon but fuck 'em |
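For the missing-semicolon case, Python 3's standard library (not available to the Python 2 script discussed here) provides `html.unescape`, which follows the HTML5 rules and accepts a fixed set of legacy names without the trailing semicolon. A short sketch, with sample strings made up for illustration:

```python
import html

# With the semicolon, as expected.
with_semicolon = html.unescape("&copy; 2011 &amp; later")

# Without it: HTML5 still recognises legacy names such as copy and amp.
# (The input strings are invented examples, not data from the log.)
without_semicolon = html.unescape("&copy 2011 &amp later")

print(with_semicolon)
print(without_semicolon)
```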
00:51
🔗
|
db48x |
indeed |
00:53
🔗
|
winr4r |
replace the last line with sys.stdout.write(d) if you don't want a forced line break at the end |
00:53
🔗
|
db48x |
nope, it's perfect |
00:53
🔗
|
SketchCow |
Oh thank goodness, another script to help me. |
00:53
🔗
|
SketchCow |
Now it takes a directory of yahoo videos and names it YV-FIRST-LAST as needed. |
00:54
🔗
|
db48x |
winr4r: superb |
00:54
🔗
|
db48x |
SketchCow: neat |
00:55
🔗
|
db48x |
winr4r: no, you're right. I already had a newline, so printing an extra one is extra |
00:57
🔗
|
SketchCow |
There it goes! 1.6tb of yahoo video being turned into tars. |
00:57
🔗
|
SketchCow |
That'll be a nice add tonight. |
00:58
🔗
|
winr4r |
db48x: different version if you want to force a newline only if it doesn't have one at the end: http://pastebin.com/HcxJDr6a |
00:58
🔗
|
winr4r |
i don't know *why*, but seeing "warning: no newline at end of file" enough times makes you that way |
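The "newline only if it doesn't have one" behaviour can be sketched as below. The linked pastebin is not reproduced in this log, so the function name `ensure_trailing_newline` is hypothetical and this is only one way to do it.

```python
def ensure_trailing_newline(text):
    """Append a newline only when text does not already end with one."""
    if text.endswith("\n"):
        return text
    return text + "\n"


# Hypothetical sample inputs: one without and one with a final newline.
print(repr(ensure_trailing_newline("last line")))
print(repr(ensure_trailing_newline("last line\n")))
```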
00:59
🔗
|
winr4r |
SketchCow: sweet |
00:59
🔗
|
db48x |
winr4r: :) |
00:59
🔗
|
db48x |
yea, I got tired of that warning a long time ago |
01:07
🔗
|
db48x |
doh |
01:07
🔗
|
db48x |
it's outputting 0xa9 for &copy; |
01:08
🔗
|
db48x |
but wait, that's right |
01:08
🔗
|
db48x |
so why doesn't it survive when pasted? |
01:10
🔗
|
db48x |
I get a replacement character when I paste :( |
01:10
🔗
|
db48x |
oh well, the output is perfect |
01:11
🔗
|
winr4r |
interesting |
01:11
🔗
|
winr4r |
as in the literal string "0xa9"? |
01:12
🔗
|
db48x |
no, the byte 0xa9 |
01:12
🔗
|
db48x |
I was expecting a multi-byte sequence |
01:12
🔗
|
winr4r |
ah |
01:12
🔗
|
winr4r |
dunno |
01:14
🔗
|
db48x |
oh, that is wrong |
01:14
🔗
|
db48x |
it should be a multibyte sequence |
01:15
🔗
|
db48x |
ahh |
01:15
🔗
|
db48x |
http://docs.python.org/release/2.3/lib/module-htmlentitydefs.html |
01:15
🔗
|
db48x |
it maps from entities to ISO-8859-1 |
01:15
🔗
|
winr4r |
oh |
01:15
🔗
|
winr4r |
silly me |
01:15
🔗
|
db48x |
no, silly python |
01:16
🔗
|
winr4r |
no, that was me using the wrong thing, entitydefs rather than name2codepoint |
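The single-byte versus multi-byte difference can be checked directly. This sketch uses Python 3's `html.entities` (the successor of the Python 2 `htmlentitydefs` module linked above); the variable names are illustrative only.

```python
from html.entities import name2codepoint

codepoint = name2codepoint["copy"]          # integer code point for &copy;
char = chr(codepoint)                       # the character itself

latin1_bytes = char.encode("iso-8859-1")    # one byte: 0xa9
utf8_bytes = char.encode("utf-8")           # two bytes: 0xc2 0xa9

# A lone 0xa9 byte is not valid UTF-8, so a UTF-8 consumer substitutes
# the replacement character U+FFFD for it.
pasted = latin1_bytes.decode("utf-8", errors="replace")

print(hex(codepoint), latin1_bytes, utf8_bytes, repr(pasted))
```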
01:16
🔗
|
db48x |
entitydefs is such a braindead thing to include |
01:17
🔗
|
db48x |
it shouldn't even be possible to make that error |
01:17
🔗
|
winr4r |
heh |
01:17
🔗
|
db48x |
File "/home/db48x/archives/lulupoetry/unified/unhtml.py", line 10, in unhtml |
01:18
🔗
|
db48x |
TypeError: expected a character buffer object |
01:18
🔗
|
db48x |
s = s.replace("&%s;" % x, name2codepoint[x]) |
01:18
🔗
|
winr4r |
that should be unichr(name2codepoint[x]) |
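A Python 3 rendering of the loop from the traceback, as a sketch. `chr` plays the role that `unichr` plays in Python 2, and the function name `unhtml_named` is hypothetical. Passing the integer from `name2codepoint` straight to `replace` fails with a `TypeError` in Python 3 as well.

```python
from html.entities import name2codepoint


def unhtml_named(s):
    """Replace every known named entity reference in s with its character."""
    for name, codepoint in name2codepoint.items():
        # chr() turns the integer code point into a one-character string;
        # handing the integer itself to replace() raises TypeError.
        s = s.replace("&%s;" % name, chr(codepoint))
    return s


print(unhtml_named("&copy; caf&eacute;"))
```

One caveat of replacing name by name: input such as `&amp;lt;` can be unescaped twice, depending on the order in which the names are visited.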
01:18
🔗
|
winr4r |
one second |
01:19
🔗
|
db48x |
ah, heh |
01:19
🔗
|
winr4r |
(you'll get an error if you do that, you need to set the default encoding first) |
01:19
🔗
|
db48x |
indeed I do |
01:20
🔗
|
winr4r |
http://pastebin.com/yr2rnzZV |
01:20
🔗
|
winr4r |
try that, sorry |
01:21
🔗
|
db48x |
perfect |
01:21
🔗
|
winr4r |
reload(sys) because site.py actually deletes the "setdefaultencoding" function from sys, for reasons that probably make sense to python developers |
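In Python 3 there is no `sys.setdefaultencoding` to restore. A common alternative, sketched here and not the approach used in the pastebin, is to encode explicitly at the point of output. The helper name `write_utf8` is hypothetical.

```python
import io


def write_utf8(text, binary_stream):
    """Write text to a binary stream as UTF-8, whatever the default encoding is."""
    binary_stream.write(text.encode("utf-8"))


# Demonstration against an in-memory stream; for a real filter the target
# would be sys.stdout.buffer instead of this BytesIO (an assumption here).
buffer = io.BytesIO()
write_utf8("\u00a9 2011\n", buffer)
print(buffer.getvalue())
```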
01:21
🔗
|
db48x |
http://pastebin.com/nihmctj4 |
01:22
🔗
|
db48x |
heh |
01:22
🔗
|
winr4r |
are those backslashes meant to be there? |
01:22
🔗
|
db48x |
they're in the html |
01:22
🔗
|
winr4r |
k fine, that's not something i screwed up then ;D |
01:22
🔗
|
db48x |
yea :) |
01:23
🔗
|
winr4r |
so you're making a text dump of lulu poetry? |
01:23
🔗
|
db48x |
yep |
01:23
🔗
|
winr4r |
nice :) |
01:23
🔗
|
db48x |
the html has so much gorp |
01:25
🔗
|
winr4r |
mhm |
02:11
🔗
|
SketchCow |
|
02:13
🔗
|
winr4r |
took the words out of my mouth |
02:23
🔗
|
BlueMax |
Yeah, suck those words |
02:40
🔗
|
SketchCow |
OK, plane is landing. |
02:40
🔗
|
SketchCow |
Or crashing |
02:40
🔗
|
SketchCow |
They never really tell you. |
02:40
🔗
|
SketchCow |
Either way, on the ground in 20. |
02:49
🔗
|
winr4r |
i didn't even know they had internet on planes, so i figured you had already landed |
02:49
🔗
|
winr4r |
welcome to the future |
02:52
🔗
|
closure |
SketchCow: dude, welcome to SF |
02:52
🔗
|
* |
closure is hanging at noisebridge |
02:52
🔗
|
chronomex |
sf, home of internet |
02:53
🔗
|
BlueMax |
sf, home of cisco |
02:53
🔗
|
* |
BlueMax is the state the obvious guy |
02:53
🔗
|
chronomex |
sf, home of many hobos |
02:56
🔗
|
BlueMax |
and a temporary home to SketchCow |
02:56
🔗
|
BlueMax |
(hobo?) |
02:56
🔗
|
BlueMax |
:P |
03:34
🔗
|
noob_ |
Hi does any one have any suggestion for backing up the google buzz / google reader data people have shared with me? |
03:50
🔗
|
lemonkey |
sketchcow is sleeping at noisebridge? |
10:37
🔗
|
efnu7z |
I own this network, #hackers 4 skillz & #trustnetwork 4 shellz |
10:38
🔗
|
efnu7z |
I own this network, #hackers 4 skillz & #trustnetwork 4 shellz |
10:39
🔗
|
chronomex |
uhhuh. |
10:41
🔗
|
efnu7z |
I own this network, #hackers 4 skillz & #trustnetwork 4 shellz |
10:42
🔗
|
efnu7z |
I own this network, #hackers 4 skillz & #trustnetwork 4 shellz |
10:43
🔗
|
chronomex |
let's see if that stops it |
16:23
🔗
|
winr4r |
SketchCow: are you still thinking of doing a hangout later? |