[03:00] <nyany> I think one of my warriors was on the Yourshot project, and when I went to check on it earlier it seemed to be doing URLTeam again
[03:00] <nyany> Even though I didn’t select “ArchiveTeam’s choice”
[03:00] <nyany> Could this happen if the project is removed and readded?
[07:45] *** d5f4a3622 has quit IRC (Ping timeout: 612 seconds)
[08:32] *** d5f4a3622 has joined #warrior
[09:34] *** mls_ has quit IRC (Remote host closed the connection)
[09:39] *** VADemon_ has joined #warrior
[09:39] *** mls_ has joined #warrior
[09:43] *** VADemon has quit IRC (Ping timeout: 258 seconds)
[09:45] *** VADemon_ has quit IRC (Quit: left4dead)
[11:09] *** d5f4a3622 has quit IRC (Ping timeout: 246 seconds)
[15:08] <chfoo> it shouldn't be automatically running a project without your permission. might be a bug
[15:37] <nyany> chfoo: i suspected as much. I've restarted the yourshot project and will report back if it happens again
[16:23] *** drafthors has joined #warrior
[16:40] <markedL> drafthors, for example we have yourshot going on but can't make it the default because that would be too many at once
[17:15] <drafthors> markedL: I see. so this would be something I could point one of the machines towards, or both?
[17:17] <drafthors> the url shorteners don't take much bandwidth, so I could conceivably start a second instance of the warrior on one of the machines
[17:22] <drafthors> you know, I don't want to sound like a smartass, especially since I've literally just arrived here, but it occurs to me that your job scheduler is not utilizing the system resources effectively
[17:23] <drafthors> you allow up to six parallel tasks, but the system limits me to a single project at a time
[17:24] <drafthors> and on top of that, the task slots are blocked in busy wait a lot of the time, although this might be due to the load of work generated for the url shorteners
[17:25] <drafthors> I've in the past donated a lot of runtime to the worldcommunitygrid, and their job scheduler seemed to perform a lot smoother
[17:26] <drafthors> I could maybe help here, I don't have much time, but I've got a strong operating systems background (PhD candidate and lecturer)
[17:27] <drafthors> so scheduling is kind of my bread and butter :)
[17:28] <markedL> that would be nice.  someone in the other room said they deployed on a pi zero so we're getting a bigger range whether it was allowed or not. 
[17:32] <markedL> most of our jobs will be busy wait even when we have a lot going on,  the worst the CPU gets is HTML parsing 
[17:36] <markedL> are there any network oriented projects in BOINC ? 
[17:37] <JAA> drafthors: Well, in the case of URLTeam, the limit is there so that you don't get your IP banned. There are also global limits (i.e. across all workers) to not overload the shorteners; yes, most of them suck and start throwing errors quite quickly. The solution there is of course to add more shorteners to the project to distribute the load, but that requires work, and AT is understaffed in many departments, 
[17:37] <JAA> to say it in management speak.
[17:38] <JAA> Similar things are going on with other projects often, e.g. Your Shot collapses if we throw more load at it.
[17:42] <drafthors> I don't have a problem with the individual project limits, I just feel like they shouldn't cause my system to idle for 120 seconds.
[17:42] <drafthors> if one project has to wait, why not do something else in the meantime
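The idea drafthors describes (when one project is rate-limited, start work from another instead of idling) can be sketched roughly as below. This is only an illustration, not how the warrior or tracker actually schedules work: the function name `run_order`, the project names, and the cooldown values are all hypothetical, and items are assumed to take no time to process.

```python
import heapq


def run_order(cooldowns, slots):
    """Simulate which project is started for each of `slots` item starts.

    `cooldowns` maps a (hypothetical) project name to the number of
    seconds that project must wait after each item, e.g. a rate limit.
    Returns a list of (start_time, project_name) tuples.
    """
    # Min-heap of (earliest time the project may run again, project name).
    ready = [(0, name) for name in sorted(cooldowns)]
    heapq.heapify(ready)

    schedule = []
    for _ in range(slots):
        # Always take whichever project becomes available soonest,
        # rather than waiting on one fixed project.
        ready_at, name = heapq.heappop(ready)
        schedule.append((ready_at, name))
        heapq.heappush(ready, (ready_at + cooldowns[name], name))
    return schedule


# Assumed example values: one project waits 120 s, the other 30 s.
plan = run_order({"urlteam": 120, "yourshot": 30}, slots=5)
for start, project in plan:
    print(start, project)
```

With these assumed numbers, the 120-second wait of the first project is filled with four starts of the second one. The per-project limits JAA mentions would still apply to each project individually.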
[17:43] <astrid> back in the bad old days i could bring down shorteners with a single perl script
[17:43] <astrid> that's before any of them had per-ip limits
[17:51] <Wingy> Why does each warrior only allow up to 6 threads? I have 24 cores and 100mbps down and would like to run maybe 8-12 threads. Is it a safety measure?
[17:54] <markedL> we're bottlenecked in the programming department
[18:00] <drafthors> the warrior is open source, right?
[18:01] <markedL> worker and server side are both open source on github
[18:02] <Wingy> Is it the warrior code 2 repo?
[18:05] <Wingy> warrior-code2 seems to just be an installer
[18:06] <markedL> drafthors was more asking about server side, so that's https://github.com/ArchiveTeam/universal-tracker
[18:06] <markedL> warriors are mostly https://github.com/ArchiveTeam/seesaw-kit +
[18:09] <drafthors> that's quite the gumbo of frameworks and technologies :D
[18:09] <Wingy> Yeah I thought it might be that but I couldn't find the warrior project picker
[18:09] <markedL> a build system, I'm not sure how warrior3 is built actually. 
[18:10] <Wingy> This is the only thing stopping higher than 20 threads https://github.com/ArchiveTeam/seesaw-kit/blob/699b0d215768c2208b5b48844c9f0f75bd6a1cbc/seesaw/script/run_pipeline.py#L57
[18:10] <Wingy> I think I can edit that to use more threads
[18:12] <markedL> right now we make people who want more than 20 orchestrate it themselves.  I observe it kinda serves as a "do you know what you're doing" test, even if it wasn't meant to be.
[18:13] <Wingy> I probably don't need more than 20 though
[18:18] *** d5f4a3622 has joined #warrior
[18:22] <JAA> Do not edit anything, ever.
[18:22] <JAA> The limit of 20 is there for a reason, namely there are bugs in seesaw at high concurrencies.
[18:23] <JAA> If you really need more (e.g. because you have many IPs bound to one powerful machine), run the pipeline multiple times.
[18:23] <JAA> You can probably run the processes in the same directory, but to be safe, I'd use separate ones.
[18:24] <JAA> Wingy: ^
[18:24] <Wingy> Okay thanks
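JAA's advice (stay at or below 20 per process, run the pipeline several times, use separate directories) could be planned along these lines. This is a minimal sketch that only builds the commands and does not launch anything. The `run-pipeline pipeline.py --concurrent N NICK` form is assumed from typical ArchiveTeam project instructions and should be checked against the seesaw-kit documentation; `plan_processes`, the `pipeline-N` directory names, and the example nickname are hypothetical.

```python
from pathlib import Path

MAX_PER_PROCESS = 20  # the per-process cap discussed above


def plan_processes(total, base_dir, nick, cap=MAX_PER_PROCESS):
    """Split `total` concurrent items across several pipeline processes.

    Returns a list of (working_directory, command) pairs, one per process,
    with no process exceeding `cap` concurrent items.
    """
    if total < 1:
        raise ValueError("total must be at least 1")

    count = -(-total // cap)            # ceiling division: processes needed
    base, extra = divmod(total, count)  # spread the items as evenly as possible

    plans = []
    for i in range(count):
        concurrent = base + (1 if i < extra else 0)
        workdir = Path(base_dir) / f"pipeline-{i}"  # separate directory per process
        # Assumed command-line form; verify before use.
        cmd = ["run-pipeline", "pipeline.py", "--concurrent", str(concurrent), nick]
        plans.append((workdir, cmd))
    return plans


for workdir, cmd in plan_processes(41, "warrior-runs", "examplenick"):
    print(workdir, " ".join(cmd))
```

Each working directory would need its own checkout of the project's pipeline code. Splitting evenly rather than filling each process to 20 keeps every process further away from the high-concurrency range where the bugs are said to appear.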
[18:37] <markedL> yeah, as a general statement there are bugs in lots of people's systems at high concurrency.  that's how I often do bug testing: just turn up the throttle
[18:47] <JAA> That's basically the "does it blend?" of softwares.
[20:29] *** drafthors has quit IRC (Ping timeout: 260 seconds)
[20:47] *** nepeat has quit IRC (Read error: Operation timed out)
[20:55] *** nepeat has joined #warrior
[21:09] *** coderobe9 is now known as coderobe