[01:34] hahaha, that was very weird to see in /away-log
[01:35] I'd add examples to the spec
[01:35] also en_US collation is stupid, why not a naive codepoint collation
[20:41] SketchCow: It's true. Without chronomex going ahead and just writing a small script and saving shorteners, I would probably still be sitting at my desk trying to design some XML schema for storing shorturls ;-)
[20:43] chronomex: I am not sure if addressing encoding is the right thing to do. We could just treat all URLs as binary data and leave it to whoever uses our releases to figure out which shortener used which encoding
[20:43] bytes traverse the network
[20:43] we record bytes
[20:44] I don't see how encodings figure into this
[20:44] And I have to admit, I don't know a lot about Unicode. It took me a while to figure out the whole collation thing.
[20:45] tinyarrows.com previously used Unicode characters as shortcodes (they seem to have stopped, probably because it was a dumb idea)
[20:45] But then again, the HTTP header doesn't have any character encoding
[20:47] Error pages like http://is.gd/mBAh however do
[20:48] seriously man, if you're being this picky, just write a goddamn WARC and be done with it
[20:49] Sounds like a neat idea, though it would probably eat a lot of storage
[20:51] Or we stick to the old format and simply ignore those (very, very, very, very) few URLs that contain a newline
[21:17] storage be damned, ia would appreciate it
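
The "treat all URLs as binary data" idea from the 20:43 and 20:51 messages can be sketched as a length-prefixed record format: because each field carries an explicit byte count, a URL containing a newline (or any other byte) round-trips intact, which the line-based format discussed above cannot guarantee. This is a minimal illustration, not the project's actual format; the function names and the 4-byte big-endian length prefix are assumptions.

```python
import io
import struct

def write_record(f, shortcode: bytes, url: bytes) -> None:
    # Length-prefix each field with a 4-byte big-endian count so the
    # payload stays opaque bytes; no delimiter, so embedded newlines
    # need no escaping or special-casing.
    for field in (shortcode, url):
        f.write(struct.pack(">I", len(field)))
        f.write(field)

def read_records(f):
    # Yield (shortcode, url) byte pairs until the stream is exhausted.
    while True:
        header = f.read(4)
        if not header:
            return
        shortcode = f.read(struct.unpack(">I", header)[0])
        url = f.read(struct.unpack(">I", f.read(4))[0])
        yield shortcode, url

# Round-trip a URL that would break a newline-delimited format.
buf = io.BytesIO()
write_record(buf, b"mBAh", b"http://example.com/a\nb")
buf.seek(0)
records = list(read_records(buf))
```

A WARC file, as suggested at 20:48, solves the same problem the heavier way: WARC records are also length-delimited (via the Content-Length header), which is why arbitrary captured bytes survive in them.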