[18:26] so if I'm a "researcher" .. any way to quickly convert between a short url and full url? without download a huge torrent/file or part of it ... (silly question)
[19:48] is the shortener still online?
[19:57] well...was more of a theoretical question :)
[19:57] let's say it isn't
[20:41] Well then you have to download the torrent
[20:55] We don't have an online resolver right now
[20:55] I was thinking about it
[20:59] 500 GB diskspace if you use b-tree indexes
[21:00] Better 750 gb to 1000 gb so we can expand
[21:45] There are some unshorteners out there, though I don't think they keep a permanent archive
[22:17] hmm
[22:17] right, http://is.gd.urlte.am/RSta7r
[22:18] or http://urlte.am/is.gd/RSta7r
[22:18] probably could make all of these acceptable
[22:41] But how do we handle spam?
[22:46] what do you mean?
[22:49] Well, assuming is.gd blocked a website for spam, tinyback will still return the original URL.
[22:50] e.g. http://is.gd/mBAh
[22:50] So urlte.am/is.gd/mBAh will link to a spam site
[22:58] ah
[22:58] well, ok
[22:59] But a nice server for urlteam data would be awesome
[23:00] The db currently lives on an 3 year old external usb drive :D
[23:03] how big is our dataset?
[23:03] 426GB as of this moment with indexes
[23:03] hmm
[23:03] stored in what?
[23:04] BerkeleyDB with b-trees
[23:04] that def wouldn't fit on a vps, but I could host it from my house or something
[23:04] * chronomex investigates storage array systems
[23:05] I intend to start a drive-of-the-month program, which means I'll need to have a drive cage
[23:05] drive-of-the-month?
[23:05] yeah you buy the cheapest 2T+ drive on newegg or whatever
[23:05] every month
[23:06] But why?
[23:06] thus ensuring that you mix batches well, and you have a continuous sustained space increase
[23:06] I like to save things
[23:07] don't we all?
[23:07] I try to only store things I need
[23:08] well ok
[23:11] Will you buy a second drive each month for backup?
[23:28] no, more of a M-copies type of filesystem thing and just keep adding drives
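
Editor's note: the log floats two URL shapes for an online resolver (http://is.gd.urlte.am/RSta7r and http://urlte.am/is.gd/RSta7r) and suggests accepting both. No such resolver existed at the time of the chat, so what follows is only a rough sketch of how both shapes might be normalised to one (shortener, code) lookup key. The function names, the routing rules, and the in-memory dict standing in for the BerkeleyDB b-tree index are all assumptions made for illustration, and the target URL in the usage lines is a placeholder, not real data.

```python
from urllib.parse import urlsplit

# Host taken from the chat; everything about how it routes is an assumption.
RESOLVER_HOST = "urlte.am"


def parse_resolver_url(url):
    """Return (shortener, code) for either URL shape, or raise ValueError."""
    if "://" not in url:
        # The chat also writes scheme-less forms such as urlte.am/is.gd/mBAh.
        url = "http://" + url
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    segments = [s for s in parts.path.split("/") if s]

    if host == RESOLVER_HOST and len(segments) == 2:
        # Path form: http://urlte.am/is.gd/RSta7r
        return segments[0].lower(), segments[1]

    suffix = "." + RESOLVER_HOST
    if host.endswith(suffix) and len(segments) == 1:
        # Subdomain form: http://is.gd.urlte.am/RSta7r
        return host[: -len(suffix)], segments[0]

    raise ValueError("not a recognised resolver URL: %r" % url)


def resolve(url, index):
    """Look up the long URL in `index`, a dict keyed by (shortener, code).

    Returns None when the key is absent. The dict is a hypothetical
    stand-in for an on-disk index.
    """
    return index.get(parse_resolver_url(url))


# Usage with placeholder data:
index = {("is.gd", "RSta7r"): "http://example.com/placeholder"}
print(resolve("http://is.gd.urlte.am/RSta7r", index))
print(resolve("http://urlte.am/is.gd/RSta7r", index))
```

The shortener name is lowercased because hostnames are case-insensitive, while the code is left untouched on the assumption that short codes are case-sensitive. The spam question raised at 22:41 is not handled here; a blocklist check would have to sit between the parse and the lookup.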