Time | Nickname | Message
07:10 | Knuckx | Does the warrior work in VMware ESXi, and if so, what do I need to change in the OVF to make it load?
17:54 | aaaaaaaaa | The tracker has been taking a hit lately. Would it be possible for the tracker to have an overloaded signal that causes a longer sleep time or have an exponential backoff on repeated rate limiting?
17:55 | aaaaaaaaa | maybe where i is number of rate limits in a row do sleep i^2+random?
17:58 | aaaaaaaaa | with a cap at 5 minutes + random, or so?
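(A minimal sketch in Python of the backoff aaaaaaaaa proposes above: sleep i^2 seconds after the i-th consecutive rate limit, capped at five minutes, plus random jitter. The function names and the jitter range are assumptions for illustration, not the warrior's actual code.)

```python
import random
import time

CAP_SECONDS = 300  # the "cap at 5 minutes" suggested above


def backoff_delay(rate_limits_in_a_row):
    """i^2 seconds for the i-th consecutive rate limit, capped, plus jitter
    so that many clients don't all retry at the same instant."""
    base = min(rate_limits_in_a_row ** 2, CAP_SECONDS)
    return base + random.uniform(0, 10)  # jitter range is an assumption


def fetch_with_backoff(request_item, is_rate_limited):
    """Keep asking the tracker for work, sleeping longer after each rate limit."""
    failures = 0
    while True:
        response = request_item()
        if not is_rate_limited(response):
            return response
        failures += 1
        time.sleep(backoff_delay(failures))
```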
18:51 | yipdw | aaaaaaaaa: possibly, but then I think people would just remove it from the code
18:51 | yipdw | we really aren't actually falling over that badly; the 502 rate is up but the tracker still works
18:53 | aaaaaaaaa | ok, it's just OOMed a few times and was thrashing badly for a while
18:54 | aaaaaaaaa | so just an idea.
18:54 | yipdw | the OOM is actually Redis
18:54 | yipdw | when people load too many items into the tracker + there's a lot of stuff leftover in done-tracking sets
18:55 | yipdw | rate-limiting the warrior wouldn't address that issue -- for that we probably need to fix up our done-set draining scripts so that e.g. they run with proper supervision
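(A rough sketch of what a supervised done-set draining loop could look like, using the redis-py client. The key name, the archive() step, and the choice of client library are assumptions; the tracker's real draining scripts aren't shown here. Note that Redis 2.6 has no SPOP count argument, so items are popped one at a time.)

```python
import time

import redis  # redis-py client (an assumption; the real scripts may differ)

DONE_SET = "done_items"  # hypothetical key name for a done-tracking set


def drain_done_set(client, key, idle_sleep=5.0):
    """Pop finished items off a Redis set one at a time and post-process them.

    Written as a long-running loop so a process supervisor (systemd, runit,
    etc.) can restart it on failure, rather than as an unsupervised one-shot
    script that can silently stop draining.
    """
    while True:
        item = client.spop(key)  # Redis 2.6: SPOP takes no count argument
        if item is None:
            time.sleep(idle_sleep)  # set is empty; poll again later
            continue
        archive(item)


def archive(item):
    # Placeholder for whatever cleanup/export the real scripts perform.
    print("drained", item)


if __name__ == "__main__":
    drain_done_set(redis.StrictRedis(), DONE_SET)
```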
18:55 | aaaaaaaaa | oh, and redis is completely in memory.
18:55 | yipdw | yes
18:55 | aaaaaaaaa | ok
18:56 | aaaaaaaaa | you'd know better than me, but I thought the two were related.
18:56 | yipdw | fack
18:56 | yipdw | used_memory:778689520
18:56 | yipdw | used_memory_human:742.62M
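(The two lines above are fields from Redis's INFO output. For example, they can be read with the redis-py client; this is an illustration of where the numbers come from, not how they were obtained here.)

```python
import redis  # redis-py client

r = redis.StrictRedis(host="localhost", port=6379)

# INFO includes the fields quoted above:
# used_memory (bytes) and used_memory_human.
info = r.info()
print(info["used_memory"], info["used_memory_human"])
```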
18:56 | aaaaaaaaa | so it is thrashing?
18:56 | yipdw | aaaaaaaaa: sort of -- the more requests, the more pressure on the Ruby app, and the more memory it uses
18:57 | yipdw | but the biggest consumer of memory is definitely stuff in Redis
18:57 | lhobas | what's the stack like?
18:57 | yipdw | the C stack isn't 742 MB deep
18:57 | yipdw | it's jemalloc-allocated heap
18:57 | yipdw | oh, did you mean software stack
18:58 | lhobas | I did, could have been more specific, sorry ;)
18:58 | yipdw | the tracker is a Ruby app, written using Sinatra, and deployed using Passenger+nginx
18:58 | yipdw | datastore is Redis 2.6.something
18:59 | lhobas | my go-to stack, not much experience with redis though
18:59 | yipdw | Passenger is configured to kill workers after they've handled 10000 requests
18:59 | yipdw | so even if we have leaks in the workers (doesn't seem like any major ones) they can't go too crazy
18:59 | chfoo | i still have the tracker i used for puush if we need it
18:59 | yipdw | I should remove that kill option
19:00 | yipdw | it just causes intermittent timeouts
19:01 | lhobas | you mean passenger killing workers results in delays while spinning up new ones?
19:01 | yipdw | yeah, if no other workers are available to service requests
19:01 | yipdw | when we hit high load that happens a lot
19:01 | lhobas | right, if there are no significant leaks, it might improve things a little
19:03 | lhobas | what kind of hardware is the tracker running on? (just curious, don't want to waste your time)
19:04 | aaaaaaaaa | The tracker runs on a Linode 1 GB instance according to the wiki
19:07 | yipdw | it has 2 GB now, but yeah
19:07 | yipdw | still at linode
19:15 | lhobas | quite amazed by the throughput tbh :)