[00:00] It's visible when running the scripts as well
[00:01] i'm wondering if it's because of this: https://github.com/ArchiveTeam/terroroftinytown-client-grab/blob/f14658cbac494883dc6bb0038d4546354f5f3345/scraper.py#L32
[00:02] i'm pretty new to how all this works, but i noticed that seesaw is basically executing `python.exe pipeline.py`, so i'm wondering if the logger is writing that error to stdout and then seesaw is seeing that as an error?
[00:03] it's intermittent: it works most of the time, but i've seen it happen like 5 or so times in a day
[00:04] I have dozens of these in my scrollback.
[00:04] No idea how far back that goes since there are no timestamps anywhere in the output, but I guess it's a couple hours maybe?
[00:04] err maybe not that flush call specifically
[00:05] suspicion is that the script (correctly) just throws an exception and bails (terroroftinytown-client-grab/pipeline.py) when it can't contact the tracker
[00:05] and with no logging set up it just logs everything to stdout, so i'm wondering if this is a seesaw / 'warrior web UI' bug or something
[00:06] i'm not sure what the 'contract' is for the pipeline scripts that get run by seesaw on what to do / not do on an error
[00:06] I think it's just this: https://github.com/ArchiveTeam/terroroftinytown-client-grab/blob/f14658cbac494883dc6bb0038d4546354f5f3345/scraper.py#L74
[00:07] I.e. if one scraper.py process fails to get an item with a 404 five times, it barks.
[00:07] possibly, it's hard to tell what is happening because i think seesaw is logging the output of the "pipeline subprocess" as if it were its own, heh
[00:08] I don't think seesaw is doing anything with regards to that. It probably just passes the stdout and stderr FDs on to the child process.
[00:08] to run it without the warrior, is it just `run-pipeline pipeline.py --concurrent 2 YOURNICKHERE` ? with seesaw installed?
[00:09] (hard to edit / test stuff out with the docker image)
[00:11] I think so, yeah. There's something about automatic updating also, but otherwise, that's it. Possibly also --disable-web-client if you don't want the web interface for that pipeline (required if you run more than one pipeline on one machine unless you change ports somehow).
[00:15] I think the proper way to fix this would be to replace `six.raise_from` with a proper `raise TrackerError(...) from error` at https://github.com/ArchiveTeam/terroroftinytown/blob/7c0093ba8b3622d1f6198188b1dd535e6698bf5d/terroroftinytown/client/tracker.py#L30 and then inspect the chained exception in scraper.py and just `sys.exit(1)` or something instead of `raise` if it's a 404.
[00:16] `raise from` only works on Python 3, but 2 is finally EOL anyway.
[00:18] Alternatively, the original exception could be stored as an attribute in TrackerError, but exception chaining is cleaner in my opinion.
[00:18] isn't six meant to handle python 2 and 3 stuff? so six.raise_from probably does 'raise from' on python3
[00:19] i think it is correctly determining it is a TrackerError, so all it needs to do is just do sys.exit(1) or something
[00:20] Yes and yes. `six.raise_from(a, b)` is equivalent to `raise a from b`. But on Python 2, it's simply `raise a` since there is no way to do the chaining.
[00:20] i dunno what the separation of responsibilities here is, i assume seesaw is the arbiter of how many times stuff gets run so i would feel like the pipeline shouldn't even have logic for repeated failures
[00:20] So if you need the original exception, you can't do that with Python 2.
[00:20] ah, is the docker container running it under python2?
[00:21] I hope not, but I have no idea.
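A minimal sketch of the [00:15] suggestion, assuming Python 3: raise the tracker error with `raise ... from error`, then inspect the chained exception and exit quietly on a 404 instead of re-raising. `HTTPError`, its `status_code` attribute, `get_item`, `exit_code_for` and `main` are hypothetical names made up for illustration, and `TrackerError` here is a simplified stand-in, not the real class from terroroftinytown.

```python
import sys


class TrackerError(Exception):
    """Simplified stand-in for the tracker client's error type."""


class HTTPError(Exception):
    """Hypothetical low-level error that carries an HTTP status code."""

    def __init__(self, status_code):
        super().__init__("HTTP %d" % status_code)
        self.status_code = status_code


def get_item(status_code):
    """Pretend to ask the tracker for an item and fail with a chained error."""
    try:
        raise HTTPError(status_code)
    except HTTPError as error:
        # Python 3 chaining: the original error is stored in __cause__.
        raise TrackerError("Could not get item from tracker") from error


def exit_code_for(error):
    """Return 1 for a chained 404, or None if the error should propagate."""
    cause = error.__cause__
    if isinstance(cause, HTTPError) and cause.status_code == 404:
        return 1
    return None


def main(status_code=404):
    try:
        get_item(status_code)
    except TrackerError as error:
        code = exit_code_for(error)
        if code is None:
            raise  # Unexpected failure: keep the full traceback.
        sys.exit(code)  # Expected "no item" case: exit without a traceback.
```

Calling `main(404)` raises `SystemExit` with code 1, while `main(502)` lets the `TrackerError` propagate with its cause attached. On Python 2 the `from` clause is a syntax error, which is why this only works once Python 2 support is dropped.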
[00:22] i feel like the docker image needs a refresh, i got mega frustrated because i was trying to run the mercurial project and that needs to compile `wget-lua`, which needs libraries that you can't get on the linux distro that the docker image is based on
[00:24] according to the warrior code i think it uses python3
[00:24] https://github.com/ArchiveTeam/warrior-code2/blob/development/warrior-runner.sh
[00:31] *** kiska1825 has joined #urlteam
[00:31] *** kiska1825 has quit IRC (Client Quit)
[00:32] *** kiska1825 has joined #urlteam
[00:32] *** Ryz has joined #urlteam
[00:42] and @JAA i tried to find the spot to add timestamps to the log messages but they are all print() statements =( sorry
[00:44] well, not all of them, i think some of them are print, it seems stuff like `OwlyService` is using logging
[00:44] Right, and the default logging output is just the message.
[00:58] *** acitrin has joined #urlteam
[00:58] hello
[00:59] is anyone here?
[00:59] *** acitrin has quit IRC (Client Quit)
[01:05] hi. oh he left
[01:10] How dare you not respond within 58 seconds?!
[01:11] Er, 42 actually.
[01:11] the tracker for urlteam2 seems to be returning a ton of 502 errors today
[01:43] *** mgrandi has quit IRC (Read error: Operation timed out)
[03:45] *** mgrandi has joined #urlteam
[12:39] Tracker needs a restart. Can't log into the admin interface
[13:07] chfoo, Kaz: ^
[13:50] *** Terbium_ has joined #urlteam
[14:02] *** Terbium has quit IRC (se.hub efnet.deic.eu)
[14:20] *** Ryz has quit IRC (Remote host closed the connection)
[14:20] *** kiska1825 has quit IRC (Remote host closed the connection)
[14:21] *** Ryz has joined #urlteam
[14:21] *** kiska1825 has joined #urlteam
[15:40] i think someone restarted it already? it's slow, but looks ok
[15:42] also, the pipeline script eventually errors out because that was the simplest way to eventually exit if the user requested the warrior to stop or change projects
[17:11] *** mgrandi has quit IRC (Leaving)
[19:38] *** Ryz has quit IRC (Remote host closed the connection)
[19:38] *** kiska1825 has quit IRC (Remote host closed the connection)
[19:39] *** Ryz has joined #urlteam
[19:39] *** kiska1825 has joined #urlteam
[20:02] *** n9nes has joined #urlteam
[20:07] *** n9nes- has quit IRC (Read error: Operation timed out)
[23:39] *** kiska1825 has quit IRC (Remote host closed the connection)
[23:39] *** Ryz has quit IRC (Remote host closed the connection)
[23:39] *** kiska1825 has joined #urlteam
[23:40] *** Ryz has joined #urlteam
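On the [00:42] to [00:44] point about missing timestamps: for the parts that go through `logging` (not the `print()` calls), a formatter with `%(asctime)s` is probably the smallest change. This is only a sketch; the logger name `example.scraper` and the helper `make_timestamped_handler` are hypothetical and not taken from the project.

```python
import io
import logging


def make_timestamped_handler(stream):
    """Build a stream handler whose records start with a timestamp."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
    )
    return handler


# Write to an in-memory buffer here; a real script would pass sys.stderr.
buffer = io.StringIO()
logger = logging.getLogger("example.scraper")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(make_timestamped_handler(buffer))

logger.info("Tracker returned 404")
line = buffer.getvalue()
```

After this runs, `line` holds one record consisting of the date and time, then `INFO example.scraper: Tracker returned 404`. Output produced by plain `print()` is unaffected and would have to be converted to logging calls first.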