[07:36] GLaDOS: I'm afraid I don't have an HTTP server with enough disk space for that
[07:41] GLaDOS, I think there is a problem with the tracker
[07:41] The tracker makes sure to only hand out one task for each service to the same IP, so that the scrapers don't get blocked
[07:41] But your tracker only hands out one task for each service, for all IPs
[07:41] soultcer, is the data available anywhere else online besides BitTorrent?
[07:42] I think the tracker does not get the real IP of the HTTP request sender, but the IP of your reverse proxy
[07:44] omf_, not that I know of. But the torrent should be seeded, according to the trackers
[07:45] I'll try to add another seed
[07:49] So 80 GB for everything
[07:49] Ah, derp, my mistake
[07:49] How would I make Apache pass the real IP?
[07:53] Also, soultcer, sending it on an external HDD would be fine.
[07:53] Actually, I have a perfect external for this
[07:53] I think web.py uses the X-Forwarded-For header
[07:54] Using a reverse proxy in front of the tracker is good though, because then the front page with the stats (which is expensive to generate) can be cached
[08:07] ugh
[08:07] not working
[08:09] stupid NAT and only barely working trackers
[08:10] I'm going to try using varnish instead
[08:10] Whatever works for you. I'm a huge nginx fan myself
[08:11] Eh, I've yet to get into nginx
[08:12] Ok, now I have found two peers for the torrent, both seeders
[08:18] Ok, I'll try launching it again
[08:41] well?
[08:43] Seems like varnish fixed it.
[08:43] All 6 of my scrapers got jobs
[08:44] Yeah, each of them got one job for a different service
[08:44] The problem still exists
[08:45] Yeah, never mind.
[08:50] Yeah, web.ctx.ip is supposed to be returning the X-Forwarded-For IP
[08:51] Tracker only sees 127.0.0.1 though:
[08:51] 127.0.0.1:54662 - - [02/Oct/2013 10:53:07] "HTTP/1.1 GET /task/get" - 200 OK
[09:14] soultcer: do you suggest replacing web.ctx.ip with this function: http://code.activestate.com/recipes/577795-get-users-ip-address-even-when-theyre-behind-a-pro/
[09:15] Or hell, just web.ctx.env['HTTP_X_FORWARDED_FOR']
[09:27] funfact: disabling cache for /task/(.*) might help
[09:27] OH GOD IP ADDRESS IS NOW NULL
[09:28] (didn't change code btw)
[09:48] I think it's handing out requests properly now
[09:50] IT IS
[09:50] soultcer: I FIXED IT
[09:50] Turns out varnish wasn't passing X-Forwarded-For at times
[09:50] Forced it to do that, and not to keep the connection open
[09:59] /data is returning nothing now :(
[10:01] what
[10:02] Cameron_D: did you mean: http://urlteam.terrywri.st/data/
[10:02] the end slash matters, apparently
[10:04] yeah, blank here
[10:05] ..
[10:06] It returns for me
[10:09] it's working on my phone, so maybe my proxy at home had cached the empty response
[10:09] probably.
[10:10] /data and /data/ are technically two very different things.
[10:11] http://urlteam.terrywri.st/data/ works for me
[10:12] It takes a few moments, but works for me as well.
[11:17] You are only allowing one concurrent access to the backend, right?
[11:29] For everything but /task
[11:29] If I did that with /task, the same issue would happen, just with the first connecting IP
[11:39] huh?
[11:39] sqlite3 doesn't support row locks, and the task assignment is a two-step process: a) select a suitable task, b) assign the task to a user
[11:40] I know, but it can't be a continuous connection if we want to keep sending the X-Forwarded-For header
[11:40] i.e. at least one read and one write
[11:40] At least that's what I read
[11:42] HTTP is stateless. You can send whatever headers you want with each request.
[11:42] The problem is, there should not be more than one write access on the db. sqlite supports transactions and so on, so it won't break or corrupt the database
[11:42] But it could accidentally hand out the same task twice
[11:48] Okay, removed the connection closing.
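A minimal sketch of the header lookup discussed at 09:14-09:15, assuming the tracker is the web.py app implied by web.ctx.ip and web.ctx.env; the real_ip helper name is hypothetical:

    import web

    def real_ip():
        # X-Forwarded-For can hold a comma-separated chain of proxies;
        # the left-most entry is the original client.
        forwarded = web.ctx.env.get('HTTP_X_FORWARDED_FOR', '')
        if forwarded:
            return forwarded.split(',')[0].strip()
        # No header present: fall back to the socket peer, which behind
        # varnish on the same host is just 127.0.0.1.
        return web.ctx.ip

This only falls back to web.ctx.ip when the proxy drops the header, which is exactly the failure mode varnish showed above.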
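And a sketch of how the two-step select-then-assign soultcer describes at 11:39-11:42 could be serialized with sqlite's own write lock; the tasks table, its columns, and assign_task are assumptions for illustration, not the tracker's actual schema:

    import sqlite3

    def assign_task(db_path, service, ip):
        # isolation_level=None puts the connection in autocommit mode,
        # so the transaction below is driven explicitly.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            # BEGIN IMMEDIATE takes sqlite's write lock up front, so two
            # concurrent requests can't both select the same unassigned task.
            conn.execute("BEGIN IMMEDIATE")
            row = conn.execute(
                "SELECT id FROM tasks "
                "WHERE service = ? AND assigned_ip IS NULL LIMIT 1",
                (service,)).fetchone()
            if row is None:
                conn.execute("ROLLBACK")  # nothing left for this service
                return None
            conn.execute("UPDATE tasks SET assigned_ip = ? WHERE id = ?",
                         (ip, row[0]))
            conn.execute("COMMIT")
            return row[0]
        finally:
            conn.close()  # closing rolls back any transaction left open

Because the lock is taken at the database level, this stays correct even with more than one tracker process, which is what makes the same-task-twice handout impossible.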
[12:35] When you get the time you should just throw the whole tracker thing away and rewrite it to actually work without any hacks ;-)
[12:37] That's implying that I can code for shit.
[12:37] Might do it as a school project
[12:37] "And why did you do this?" "Because URL shorteners were a fucking awful idea."
[13:02] "See this short link"
[13:03] -"It... doesn't work."
[13:03] "Point taken yet?"
[13:05] Put all my references as short URLs on an about-to-die URL shortener
[22:58] Do we need anything here?
[22:58] Tracker is not working?