Time | Nickname | Message
00:00 | JAA | run*
00:54 | Kaz | Just outgoing
07:05 |  | kbtoo_ has quit IRC (Read error: Operation timed out)
08:55 |  | kbtoo has joined #warrior
09:12 | _folti_ | you only need open ports to access its web interface
09:13 | _folti_ | (which is unprotected, so you should think twice before opening it to the big bad internet)
11:46 |  | Gallifrey has joined #warrior
11:47 | Gallifrey | I'm trying to increase my concurrency and I'm wondering which is the more efficient arrangement: 2 Warriors (on the same machine) doing 2 items each, or 1 single Warrior doing 4 items?
12:07 | jut | One vm with more concurrency
12:15 | Gallifrey | Thanks
13:26 |  | kbtoo has quit IRC (Read error: Connection reset by peer)
13:29 |  | kbtoo has joined #warrior
15:13 |  | warmwaffl has joined #warrior
16:27 | chfoo | i tried troubleshooting the tracker ruby application to see why it stalls. i have watch passenger-status running: requests in queue is 0, but then after a while it accumulates to 500 in the queue without any of the processes' "processed" counters going up. then after a while the queue goes back to 0. redis slowlog only shows the tracker log drainer, which doesn't seem to be blocking anything
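A minimal shell sketch of the monitoring described above, assuming shell access to the tracker host with passenger-status and redis-cli available; the polling interval and slowlog count are arbitrary:

    # Poll Passenger's pool stats (requests in queue, per-process "processed" counters)
    watch -n 5 passenger-status

    # In a second terminal, keep an eye on Redis while the queue grows
    redis-cli --latency        # continuous round-trip latency measurement
    redis-cli slowlog get 25   # the 25 most recent slow commands (e.g. the log drainer)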
16:28 |  | JAA sets mode: +oo chfoo Kaz
16:29 | kiska | I see...
16:34 | chfoo | i also noticed that client connections disconnect while waiting in the queue, which might indicate slow clients. unfortunately, i don't see a request timeout option in the free (non-enterprise) version. and even if it existed, i don't want to risk breaking things, because it would need to apply only to the seesaw api and not the admin urls
16:36 | kiska | I am looking at the telegraf graphs for the tracker server, and I am seeing CPU usage spike up to 100%, stay there for a couple of minutes, then drop back down to 25%. I am guessing that is the result of the stalling, and passenger might be panicking?
16:37 | chfoo | i did notice that correlation. i would have assumed cpu usage would go down, since it would have nothing to do
16:38 | kiska | I am guessing it's happening again, since it's now at 86% and rising
16:38 | kiska | Is there a profiler that you can attach?
16:39 | chfoo | not that i'm aware of
16:40 | kiska | I am guessing you just can't use this tool? https://github.com/MiniProfiler/rack-mini-profiler
16:43 | chfoo | "Ruby 2.3+": i'm not sure if the tracker is compatible
16:43 | kiska | Yeah, that was what I feared
16:57 | chfoo | one issue in the past was that worker process memory would bloat to 1 or 2 GB, so max_requests was set to 4000 to force restarts. i have it set to 8000 right now
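A hedged way to confirm where that limit lives and whether the bloat still occurs, assuming the tracker runs under Passenger's nginx integration and its site config sits under /etc/nginx/ (both are assumptions; adjust to the real deployment):

    # Locate the process-recycling limit mentioned above (was 4000, now 8000)
    grep -R "passenger_max_requests" /etc/nginx/

    # passenger-status prints per-process memory, so the 1-2 GB bloat is visible
    # before the recycling threshold is reached
    passenger-status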
16:57 | kiska | Is that still a problem?
17:00 | JAA | It looks like memory usage fluctuates by over a GB, and CPU usage drops when it reaches a peak.
17:02 | JAA | See e.g. https://atdash.meo.ws/d/000000058/telegraf-detail?orgId=1&var-user=astrid&var-host=xn--zoty-01a&from=1553789236525&to=1553789861131
17:08 | Kaz | I think one thing to raise is: do we care about 'slow connections vs. no connections'? As it stands, once that queue hits 500, you get a barrage of things retrying
17:08 | Kaz | Would we prefer just letting them sit in the queue for longer, and responding once passenger stops hanging?
17:09 | Kaz | that's by no means a *solution*, don't get me wrong
17:11 | marked | if bigger hardware fixes all ills, we should be able to duplicate the software environment with just the package list from the current machine
17:12 |  | Medowar has joined #warrior
17:13 | Kaz | I haven't actually checked, but does redis actually reply to queries when the queue is full? I wonder if they're just all waiting for info
17:27 | marked | does yipdw ever come by? one of their patches for the tracker looks useful
17:38 | chfoo | i had redis --latency running for a while and it didn't stop when the queue got full
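A small sketch of how to answer that question directly during the next stall, assuming redis-cli can reach the tracker's Redis; host and port are placeholders for the real instance:

    # One-off check: does Redis still answer while Passenger's queue sits at 500?
    redis-cli -h 127.0.0.1 -p 6379 ping

    # Rolling latency statistics (one summary line every ~15 seconds) to spot the stall window
    redis-cli -h 127.0.0.1 -p 6379 --latency-history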
17:38 |  | Dj-Wawa has quit IRC (Quit: Connection closed for inactivity)
17:40 | Kaz | interestingly, not all the threads lock at the same time. i just watched 3 of them lock up, then finally the 4th locked and the queue bloated
18:45 |  | Gallifrey has quit IRC (Quit: http://www.mibbit.com ajax IRC Client)
20:08 |  | robbierut has joined #warrior
20:10 |  | rnduser_ has joined #warrior
20:15 |  | duneman25 has joined #warrior
20:18 |  | lemoniter has joined #warrior
20:19 |  | notanick has joined #warrior
20:21 |  | qbrd has joined #warrior
20:46 | notanick | Hi everyone, I have just set up a warrior to help on the Google+ project. It's been up and running for 30 min or so, but I'm still not sure how exactly it works.
20:47 | notanick | It seems like it uploads the pulled data immediately. What does it need the stated requirement of 60 GB of available space for? Also, where exactly is the data uploaded to? Does it go to Archive Team servers or somewhere else?
20:47 | tmg1|eva | another one of my workers froze... leaving 1/6 remaining unfrozen.
20:47 | tmg1|eva | notanick: i could be wrong, but the 60 GB figure was from other projects
20:48 | tmg1|eva | the average work unit seems to be in the tens of MB
20:48 | tmg1|eva | so really you only need about 100 MB on average... but
20:48 | tmg1|eva | there are work units that are upwards of 30 GiB each
20:48 | notanick | Ah ok, that's what I thought but I wasn't sure
20:48 | tmg1|eva | so if you have 6 workers you'd need 180 GiB to be sure that if you got 6 of those
20:48 | tmg1|eva | it wouldn't break
20:49 | robbierut | There are some items that can be bigger, so they need the space. But the VM only takes the space it needs, not what you set it to
20:50 | tmg1|eva | without more details (i.e. stddev / stderr / CI) it's hard to know the likelihood of that
20:50
🔗
|
robbierut |
I got a few of 20gbit items I think. But I'm running 10 vm's with 6 workers |
20:52
🔗
|
tmg1|eva |
robbierut: how are you measuring? |
20:52
🔗
|
robbierut |
Somebody spotted them for me in the tracker website running list |
20:53
🔗
|
notanick |
Ok, and where is the data uploaded to? The logs during the upload phase show a variety of domains. Do they belong to archive-team? (If yes, where did they get all their storage and bandwidth from?) |
20:53
🔗
|
JAA |
notanick: Even Google+ has some really large items; the largest I've seen was nearly 25 GB, but there have probably been larger ones. But yeah, 60 GB is rarely used normally. It's mostly the maximum size your warrior might use. The VM should only use as much disk space as it actually needs, though IIRC the disk image won't shrink back down again automatically in VirtualBox. |
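If the image has ballooned, it can usually be compacted by hand; a sketch assuming a dynamically allocated VDI whose free space has been zeroed inside the guest first (the file name is illustrative):

    # Shrink a dynamically allocated VirtualBox disk image after large items have passed through
    # (only zeroed blocks inside the guest are reclaimed)
    VBoxManage modifymedium disk "archiveteam-warrior.vdi" --compact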
20:55
🔗
|
robbierut |
Notanick, the warrior uploads data to a target. Targets combine files to 50gb megawarcs and upload those to the Internet Archive (IA) or to a storage buffer server maintained by the archive team |
20:58
🔗
|
notanick |
So I'm uploading to servers that cache the files which eventually upload to archive.org? Where do these targets come from? (sorry for asking so many questions) |
20:59
🔗
|
robbierut |
Yes |
20:59
🔗
|
robbierut |
The targets are paid and maintained by volunteers from the archive team |
21:00
🔗
|
robbierut |
Trustworthy people with enough time and skills to do that |
21:00
🔗
|
notanick |
ok |
21:01
🔗
|
qbrd |
how do you manage a fleet of warriors? |
21:01
🔗
|
notanick |
final question: What prevents someone from uploading junk data? |
21:01
🔗
|
qbrd |
I've built 8 so far, and they don't seem to honor `--env CONCURRENT_ITEMS=$(nproc)`, so I'm having to login to each one individually, set that, then restart the warrior... |
21:01
🔗
|
robbierut |
Actually, I dont do a lot qbrd. Once you saved the settings they work automatically. I only need to start the vm's and they do everything themselves |
21:01
🔗
|
qbrd |
it would be really nice to have a central point to handle that. |
21:02
🔗
|
yano |
the max CONCURRENT_ITEMS can be is 6 |
21:02
🔗
|
robbierut |
Oh, in virtualbox set a different port per vm. So you can access them all in a new tab in your browser |
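A sketch of that per-VM port mapping with VBoxManage, assuming NAT networking and that the warrior's web interface listens on port 8001 inside the guest (VM names and host ports are illustrative; the VM must be powered off for modifyvm):

    # Forward a distinct host port to each warrior's web UI (guest port 8001)
    VBoxManage modifyvm "warrior-1" --natpf1 "webui,tcp,,8001,,8001"
    VBoxManage modifyvm "warrior-2" --natpf1 "webui,tcp,,8002,,8001"
    # then open http://localhost:8001 and http://localhost:8002 in separate tabs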
21:03 | robbierut | Notanick: I don't know actually, but why would you do that? I'd guess there is something built in
21:03 | qbrd | > yano> the max CONCURRENT_ITEMS can be set to is 6
21:03 | qbrd | ah, that could be my problem on the 8-proc nodes. but what about the four-core nodes?
21:04 | yano | the web GUI menu also shows that the max value is 6
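For the Docker-based warriors qbrd describes, a hedged sketch that clamps the setting to the 6-item cap mentioned above; the image name, published port, and flags are assumptions rather than a confirmed recipe:

    # Cap concurrency at 6 (the warrior's maximum) instead of passing nproc straight through
    CONCURRENCY=$(( $(nproc) < 6 ? $(nproc) : 6 ))
    docker run --detach \
      --env CONCURRENT_ITEMS="$CONCURRENCY" \
      --publish 8001:8001 \
      archiveteam/warrior-dockerfile   # assumed image name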
21:04
🔗
|
qbrd |
and why is the max 6? Seems arbitrary. |
21:06
🔗
|
robbierut |
Its to not get an ip ban from the more strict websites and also to not overload your pc. I believe you can up it to 20 while running the normal script |
21:06
🔗
|
qbrd |
ah, interesting. |
21:06
🔗
|
qbrd |
neat! |
21:06
🔗
|
robbierut |
I am running 100 workers with no problem but someone here got an ip ban I believe for running to many connections to google |
21:06
🔗
|
robbierut |
Or at least a lot of error |
21:06
🔗
|
robbierut |
S |
21:07
🔗
|
notanick |
robbierut, I mean, I wouldn't. But I could imagine how someone somewhere could do that just to be annoying. |
21:08
🔗
|
yano |
if you use docker you can configure each warrior to use a different ip address |
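A sketch of that idea using a macvlan network, so each container's outgoing traffic really does come from its own address; the subnet, gateway, parent interface, and image name are all assumptions for illustration:

    # Give each warrior container its own address on the local network
    docker network create -d macvlan --subnet 192.0.2.0/24 --gateway 192.0.2.1 \
      -o parent=eth0 warriors
    docker run -d --name warrior1 --network warriors --ip 192.0.2.11 \
      -e CONCURRENT_ITEMS=6 archiveteam/warrior-dockerfile
    docker run -d --name warrior2 --network warriors --ip 192.0.2.12 \
      -e CONCURRENT_ITEMS=6 archiveteam/warrior-dockerfile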
21:08 | yano | and Google supports IPv6
21:10 | robbierut | True, yano, but some services block connections at a higher level than an individual IPv6 address. I don't know exactly, but for IPv4 it's like setting a limit on 1.2.3.* instead of 1.2.3.4
21:11 | yano | true, some places block a /64 for IPv6, or a /112 if they are trying to be careful and not block too many people
21:12 | robbierut | Yeah, that
21:13 | robbierut | But still, most people won't run into that limit
21:13
🔗
|
|
lemoniter has quit IRC (Quit: Page closed) |
21:20
🔗
|
JAA |
I'm really surprised by that actually. This is not the first time we're trying to archive something from Google, and we've been rate-limited to hell before. |
21:22
🔗
|
robbierut |
Really? I'm running quite a long time with 100 workers. I only saw fusl today getting ip bans I think. |
21:22
🔗
|
robbierut |
My workers are working quite nicely |
21:23
🔗
|
SmileyG |
Maybe someone at google likes us |
21:23
🔗
|
SmileyG |
maybe they want a cake |
21:24
🔗
|
JAA |
Maybe they don't want bad press (cf. Tumblr). |
21:25
🔗
|
robbierut |
I heard something about a few people in g+ made it easier for us |
21:33
🔗
|
SmileyG |
well, they only need to shutdown the API's really |
21:33
🔗
|
SmileyG |
that's the 'bad' bit |
21:34
🔗
|
robbierut |
Yeah, but good reason to close the thing and save some money for something else |
21:34
🔗
|
robbierut |
And besides money, talented people |
21:36 |  | notanick has quit IRC (Remote host closed the connection)
21:38 |  | KoalaBear has joined #warrior
22:23 | marked | robbierut: where did you hear about G+ helping us out?
22:23 | robbierut | Somewhere on IRC. That someone was closing error messages on their side
22:48 | KoalaBear | I've heard the same, but I'm not sure when/where/who
22:57 |  | phiresky has joined #warrior