#urlteam 2012-03-31,Sat


Time Nickname Message
01:34 🔗 chronomex hahaha, that was very weird to see in /away-log
01:35 🔗 chronomex I'd add examples to the spec
01:35 🔗 chronomex also en_us collation is stupid, why not a naive codepoint collation
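[Editor's sketch, not part of the log] The "naive codepoint collation" chronomex suggests can be illustrated in a few lines of Python. This is only a sketch under assumptions: the function names and the sample shortcodes are made up for illustration, and nothing here is taken from the actual spec being discussed. It shows that sorting by code point needs no locale data, and that sorting the UTF-8 bytes gives the same order.

```python
def codepoint_sort(codes):
    """Sort shortcodes by Unicode code point (Python's default str ordering)."""
    return sorted(codes)


def byte_sort(codes):
    """Sort shortcodes by their UTF-8 bytes.

    UTF-8 preserves code point order, so this matches codepoint_sort.
    """
    return sorted(codes, key=lambda s: s.encode("utf-8"))


# Hypothetical shortcodes, chosen only to show how case and non-ASCII sort.
sample = ["b", "B", "a", "é", "Z"]
print(codepoint_sort(sample))
print(byte_sort(sample))
```

Unlike an en_US collation, this order puts all uppercase ASCII before lowercase and does not depend on which locale tables are installed on the machine doing the sort.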
20:41 🔗 soultcer SketchCow: It's true. Without chronomex going ahead and just writing a small script and saving shorteners I would probably still sit at my desk and try to design some XML scheme for storing shorturls ;-)
20:43 🔗 soultcer chronomex: I am not sure if addressing encoding is the right thing to do. We could just treat all URLs as binary data and leave it to whoever uses our releases to figure out which shortener used which encoding
20:43 🔗 chronomex bytes traverse the network
20:43 🔗 chronomex we record bytes
20:44 🔗 chronomex I don't see how encodings figure into this
20:44 🔗 soultcer And I have to admit, I don't know a lot about Unicode. It took me a while to figure out the whole collation thing.
20:45 🔗 soultcer tinyarrows.com previously used Unicode characters as shortcodes (they seem to have stopped, probably because it was a dumb idea)
20:45 🔗 soultcer But then again, the http header doesn't have any character encoding
20:47 🔗 soultcer Error pages like http://is.gd/mBAh however do
20:48 🔗 chronomex seriously man if you're being this picky, just write a goddamn warc and be done with it
20:49 🔗 soultcer Sounds like a neat idea, though it would probably eat a lot of storage
20:51 🔗 soultcer Or we stick to the old format and simply ignore those (very, very, very, very) few URLs that contain a newline
21:17 🔗 chronomex storage be damned, ia would appreciate it
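[Editor's sketch, not part of the log] The two ideas above (record URLs as raw bytes, and skip the few records containing a newline in a line-based format) might look roughly like this in Python. The `code|url` line layout, the `|` separator, and the `serialize` function name are assumptions made for illustration; the log does not describe the actual "old format".

```python
def serialize(pairs, sep=b"|"):
    """Write (shortcode, url) byte pairs as one record per line.

    Values are kept as raw bytes, with no attempt to decode them.
    Records containing a newline cannot be represented in a
    line-based file, so they are returned separately, not written.
    """
    lines = []
    skipped = []
    for code, url in pairs:
        if b"\n" in code or b"\n" in url:
            skipped.append((code, url))
            continue
        lines.append(code + sep + url + b"\n")
    return b"".join(lines), skipped


# Hypothetical records: one with a non-UTF-8 byte, one with a newline.
records = [
    (b"abc", b"http://example.com/\xe9"),
    (b"x", b"http://a/\nb"),
]
data, skipped = serialize(records)
print(data)
print(skipped)
```

Keeping everything as `bytes` sidesteps the encoding question entirely, at the cost of dropping the rare newline-containing records. A WARC, as suggested above, would avoid that loss but store far more per record.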
