07:36 <soultcer> GLaDOS: I'm afraid I don't have a http server with enough diskspace for that
07:41 <soultcer> GLaDOS, I think there is a problem with the tracker
07:41 <soultcer> The tracker makes sure to only hand out one task for each service to the same IP, so that the scrapers don't get blocked
07:41 <soultcer> But your tracker only hands out one task for each service, for all IPs
07:41 <omf_> soultcer, is the data available anywhere else online besides bit torrent?
07:42 <soultcer> I think the tracker does not get the real IP of the http request sender, but the IP of your reverse proxy
07:44 <soultcer> omf_, not that I know of. But the torrent should be seeded, according to the trackers
07:45 <soultcer> I'll try to add another seed
07:49 <omf_> So 80gb for everything
07:49 <GLaDOS> Ah, derp, my mistake
07:49 <GLaDOS> How would I make apache pass the real IP?
07:53 <GLaDOS> Also, soultcer, sending it on an external HDD would be fine.
07:53 <GLaDOS> Actually, I have a perfect external for this
07:53 <soultcer> I think web.py uses the X-Forwarded-For header
07:54 <soultcer> Using a reverse proxy before the tracker is good though, because then the front page with the stats (which is expensive to generate) can be cached
08:07 <GLaDOS> ugh
08:07 <GLaDOS> not working
08:09 <soultcer> stupid nat and only barely working trackers
08:10 <GLaDOS> I'm going to try using varnish instead
08:10 <soultcer> Whatever works for you. I'm a huge nginx fan myself
08:11 <GLaDOS> Eh, I've yet to get into nginx
08:12 <soultcer> Ok, now I have found two peers for the torrent, both seeders
08:18 <GLaDOS> Ok, I'll try launching it again
08:41 <omf_> well?
08:43 <GLaDOS> Seems like varnish fixed it.
08:43 <GLaDOS> All 6 of my scrapers got jobs
08:44 <soultcer> Yeah, each of them got one job for a different service
08:44 <soultcer> The problem still exists
08:45 <GLaDOS> Yeah, nevermind.
08:50 <GLaDOS> Yeah, web.ctx.ip is supposed to be returning the x-forwarded-for ip
08:51 <GLaDOS> 127.0.0.1:54662 - - [02/Oct/2013 10:53:07] "HTTP/1.1 GET /task/get" - 200 OK
08:51 <GLaDOS> Tracker only sees 127.0.0.1 though:
09:14 <GLaDOS> soultcer: do you suggest replacing web.ctx.ip with this function: http://code.activestate.com/recipes/577795-get-users-ip-address-even-when-theyre-behind-a-pro/
09:15 <GLaDOS> Or hell, just web.ctx.env['HTTP_X_FORWARDED_FOR']
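The header lookup GLaDOS is suggesting can be sketched as a small helper over a WSGI-style environ dict (web.py exposes this as `web.ctx.env`; `web.ctx.ip` normally reflects `REMOTE_ADDR`, which is why the tracker kept seeing 127.0.0.1). The function name and fallback logic here are illustrative, not the tracker's actual code. Note that `X-Forwarded-For` can carry a comma-separated proxy chain, so the original client is the first entry:

```python
def real_ip(env):
    """Return the client IP, preferring X-Forwarded-For over the socket peer.

    X-Forwarded-For may hold a chain like "client, proxy1, proxy2";
    the original client is the first entry. Only trust this header
    when the app actually sits behind a proxy you control.
    """
    forwarded = env.get('HTTP_X_FORWARDED_FOR', '')
    if forwarded:
        return forwarded.split(',')[0].strip()
    # No proxy header present: fall back to the direct peer address.
    return env.get('REMOTE_ADDR', '')
```

In a web.py handler this would be called as `real_ip(web.ctx.env)`.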
09:27 <GLaDOS> funfact: disabling cache for /task/(.*) might help
09:27 <GLaDOS> OH GOD IP ADDRESS IS NOW NULL
09:28 <GLaDOS> (didn't change code btw)
09:48 <GLaDOS> I think it's handing out requests properly now
09:50 <GLaDOS> IT IS
09:50 <GLaDOS> soultcer: I FIXED IT
09:50 <GLaDOS> Turns out varnish wasn't passing X-FORWARDED-FOR at times
09:50 <GLaDOS> Forced it to do that, and not keep connection open
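The fix GLaDOS describes (always append the client address to `X-Forwarded-For`, and stop caching task hand-outs) might look roughly like this in Varnish 3-era VCL. This is a sketch, not GLaDOS's actual config: the `/task` URL pattern is assumed from the conversation.

```vcl
sub vcl_recv {
    # Always forward the original client address to the backend tracker,
    # appending to any chain an upstream proxy already started.
    if (req.http.X-Forwarded-For) {
        set req.http.X-Forwarded-For = req.http.X-Forwarded-For + ", " + client.ip;
    } else {
        set req.http.X-Forwarded-For = client.ip;
    }
    # Bypass the cache for task hand-outs so every scraper gets a fresh
    # answer; the stats front page stays cacheable.
    if (req.url ~ "^/task") {
        return (pass);
    }
}
```

Newer Varnish versions set `X-Forwarded-For` in the built-in `vcl_recv` automatically, which is one reason hand-rolled configs from this era drift out of date.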
09:59 <Cameron_D> /data is returning nothing now :(
10:01 <GLaDOS> what
10:02 <GLaDOS> Cameron_D: did you mean: http://urlteam.terrywri.st/data/
10:02 <GLaDOS> the end slash matters, apparently
10:04 <Cameron_D> yeah, blank here
10:05 <GLaDOS> ..
10:06 <GLaDOS> It returns for me
10:09 <Cameron_D> its working on my phone, so maybe my proxy at home had cached nothing
10:09 <GLaDOS> probably.
10:10 <ersi> /data and /data/ are technically two very different things.
10:11 <omf_> http://urlteam.terrywri.st/data/ works for me
10:12 <ersi> It takes a few moments, but works for me as well.
11:17 <soultcer> You are only allowing one concurrent access to the backend, right?
11:29 <GLaDOS> For everything but /task
11:29 <GLaDOS> If I did that with /task, the same issue would happen, just with the first connecting IP
11:39 <soultcer> huh?
11:39 <soultcer> sqlite3 doesn't support locks and the task assignment is a two-step process: a) Select a suitable task b) assign task to user
11:40 <GLaDOS> I know, but it can't be a continuous connection if we want to keep sending the x-forwarded-for connection
11:40 <ersi> ie: at least one read and one write
11:40 <GLaDOS> At least that's what I read
11:42 <soultcer> HTTP is stateless. You can send whatever headers you want with each request.
11:42 <soultcer> The problem is, there should not be more than one write access on the db. sqlite supports transactions and so on, so it won't break or corrupt the database
11:42 <soultcer> But it could accidentally hand out the same task twice
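The race soultcer is describing (select a suitable task, then assign it, with another request interleaving between the two steps) can be closed by doing both steps inside one immediate transaction: `BEGIN IMMEDIATE` takes sqlite's database-level write lock up front, so two claimers cannot both select the same unassigned task. A sketch, with a hypothetical `tasks(id, service, assigned_to)` schema and function name:

```python
import sqlite3

def claim_task(db_path, ip):
    """Atomically claim one unassigned task for `ip`; return its id or None.

    BEGIN IMMEDIATE acquires the write lock before the SELECT, so the
    select-then-assign pair is serialized against other claimers and the
    same task can never be handed out twice.
    """
    con = sqlite3.connect(db_path, isolation_level=None)  # autocommit mode
    try:
        con.execute("BEGIN IMMEDIATE")
        row = con.execute(
            "SELECT id FROM tasks WHERE assigned_to IS NULL LIMIT 1"
        ).fetchone()
        if row is None:
            con.execute("COMMIT")
            return None
        con.execute("UPDATE tasks SET assigned_to = ? WHERE id = ?", (ip, row[0]))
        con.execute("COMMIT")
        return row[0]
    finally:
        con.close()
```

A concurrent claimer that hits the lock gets `sqlite3.OperationalError: database is locked` after the busy timeout, which a real tracker would retry rather than crash on.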
11:48 <GLaDOS> Okay, removed the connection closing.
12:35 <soultcer> When you get the time you should just throw the whole tracker thing away and rewrite it to actually work without any hacks ;-)
12:37 <GLaDOS> That's implying that I can code for shit.
12:37 <GLaDOS> Might do it as a school project
12:37 <GLaDOS> "And why did you do this?" "Because URL shorteners were a fucking awful idea."
13:02 <ersi> "See this short link"
13:03 <ersi> -"It.. doesn't work."
13:03 <ersi> "Point taken yet?"
13:05 <GLaDOS> Put all my references as short urls on a about to die URL shortener
22:58 <SketchCow> Do we need anything here?
22:58 <SketchCow> Tracker is not working?