Time |
Nickname |
Message |
00:00
🔗
|
JAA |
It's visible when running the scripts as well |
00:01
🔗
|
mgrandi |
i'm wondering if it's because of this: https://github.com/ArchiveTeam/terroroftinytown-client-grab/blob/f14658cbac494883dc6bb0038d4546354f5f3345/scraper.py#L32 |
00:02
🔗
|
mgrandi |
i'm pretty new on how all this works, but i noticed that seesaw is basically executing `python.exe pipeline.py` so i'm wondering if the logger is writing that error to stdout and then seesaw is seeing that as an error? |
00:03
🔗
|
mgrandi |
it's intermittent because it works most of the time but i've seen it happen like 5 or so times in a day |
00:04
🔗
|
JAA |
I have dozens of these in my scrollback. |
00:04
🔗
|
JAA |
No idea how far back that goes since there are no timestamps anywhere in the output, but I guess it's a couple hours maybe? |
00:04
🔗
|
mgrandi |
err maybe not that flush call specifically |
00:05
🔗
|
mgrandi |
suspicion is that the script (correctly) just throws an exception and bails (terroroftinytown-client-grab/pipeline.py) when it can't contact the tracker |
00:05
🔗
|
mgrandi |
and with no logging set up it just logs everything to stdout, so i'm wondering if this is a seesaw / 'warrior web UI' bug or something |
00:06
🔗
|
mgrandi |
i'm not sure what the 'contract' is for the pipeline scripts that get run by seesaw on what to do / not do in an error |
00:06
🔗
|
JAA |
I think it's just this: https://github.com/ArchiveTeam/terroroftinytown-client-grab/blob/f14658cbac494883dc6bb0038d4546354f5f3345/scraper.py#L74 |
00:07
🔗
|
JAA |
I.e. if one scraper.py process fails to get an item with a 404 five times, it barks. |
00:07
🔗
|
mgrandi |
possibly, it's hard to tell what is happening because i think seesaw is logging the output of the "pipeline subprocess" as if it were its own, heh |
00:08
🔗
|
JAA |
I don't think seesaw is doing anything with regards to that. It just passes the stdout and stderr FDs on to the child process probably. |
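A minimal sketch of the behaviour JAA is guessing at, using only Python's standard `subprocess` module. It is not seesaw's code (which this log does not show); `CHILD_CODE` and `run_child` are hypothetical names for illustration. When no `stdout` argument is given, the child writes straight to the parent's stdout, so the parent never sees the text and cannot tell it apart from its own output.

```python
import subprocess
import sys

# Hypothetical child program standing in for `python pipeline.py`.
CHILD_CODE = "print('hello from child')"


def run_child(capture):
    """Run the child, either capturing its stdout or letting it inherit ours."""
    return subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        # None means "inherit the parent's file descriptor".
        stdout=subprocess.PIPE if capture else None,
        text=True,
        check=False,
    )


# Inherited: the child's output appears on the parent's terminal or log,
# and the parent gets no copy of it.
inherited = run_child(capture=False)

# Captured: the parent receives the text and could timestamp or filter it.
captured = run_child(capture=True)
```

With inherited descriptors, anything the pipeline prints ends up interleaved with the parent's own output, which would match what mgrandi describes above.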
00:08
🔗
|
mgrandi |
to run it without the warrior, is it just `run-pipeline pipeline.py --concurrent 2 YOURNICKHERE` ? with seesaw installed? |
00:09
🔗
|
mgrandi |
(hard to edit / test stuff out with the docker image) |
00:11
🔗
|
JAA |
I think so, yeah. There's something about automatic updating also, but otherwise, that's it. Possibly also --disable-web-client if you don't want the web interface for that pipeline (required if you run more than one pipeline on one machine unless you change ports somehow). |
00:15
🔗
|
JAA |
I think the proper way to fix this would be to replace `six.raise_from` with a proper `raise TrackerError(...) from error` at https://github.com/ArchiveTeam/terroroftinytown/blob/7c0093ba8b3622d1f6198188b1dd535e6698bf5d/terroroftinytown/client/tracker.py#L30 and then inspect the chained exception in scraper.py and just `sys.exit(1)` or something instead of `raise` if it's a 404. |
00:16
🔗
|
JAA |
`raise from` only works on Python 3, but 2 is EOL anyway finally. |
00:18
🔗
|
JAA |
Alternatively, the original exception could be stored as an attribute in TrackerError, but exception chaining is cleaner in my opinion. |
00:18
🔗
|
mgrandi |
isn't six meant to handle python 2 and 3 stuff? so six.raise_from probably does 'raise from' on python3 |
00:19
🔗
|
mgrandi |
i think it is correctly determining it is a TrackerError, so all it needs to do is just do sys.exit(1) or something |
00:20
🔗
|
JAA |
Yes and yes. `six.raise_from(a, b)` is equivalent to `raise a from b`. But on Python 2, it's simply `raise a` since there is no way to do the chaining. |
00:20
🔗
|
mgrandi |
i dunno what the separation of responsibilities here is, i assume seesaw is the arbiter of how many times stuff gets run so i would feel like the pipeline shouldn't even have logic for repeated failures |
00:20
🔗
|
JAA |
So if you need the original exception, you can't do that with Python 2. |
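A rough sketch of the approach JAA describes, assuming Python 3. `TrackerError` is the name used in the discussion, but its definition here, along with `FakeHTTPError`, `get_item`, `exit_code_for` and `main`, is a hypothetical stand-in and not the real tracker.py or scraper.py code.

```python
import sys


class TrackerError(Exception):
    """Hypothetical stand-in for the tracker client's error type."""


class FakeHTTPError(Exception):
    """Hypothetical HTTP error that carries a status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def get_item(status_code):
    """Pretend to ask the tracker for an item and fail with the given status."""
    try:
        raise FakeHTTPError(status_code)
    except FakeHTTPError as error:
        # Python 3 chaining: the original error is stored on __cause__.
        raise TrackerError("could not get item from tracker") from error


def exit_code_for(exc):
    """Return 1 if the chained cause is a 404, otherwise None."""
    cause = exc.__cause__
    if isinstance(cause, FakeHTTPError) and cause.status_code == 404:
        return 1
    return None


def main(status_code):
    """Exit quietly on a chained 404; re-raise anything else."""
    try:
        get_item(status_code)
    except TrackerError as exc:
        code = exit_code_for(exc)
        if code is None:
            raise
        sys.exit(code)
```

Calling `main(404)` exits with status 1 and no traceback, while `main(502)` re-raises the `TrackerError` with the original error still attached as its cause.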
00:20
🔗
|
mgrandi |
ah, is the docker container running it under python2? |
00:21
🔗
|
JAA |
I hope not, but I have no idea. |
00:22
🔗
|
mgrandi |
i feel like the docker image needs a refresh, i got mega frustrated cause i was trying to run the mercurial project and that needs to compile `wget-lua` and that needs libraries that you can't get on the linux distro that the docker image is on |
00:24
🔗
|
mgrandi |
according to the warrior code i think it uses python3 |
00:24
🔗
|
mgrandi |
https://github.com/ArchiveTeam/warrior-code2/blob/development/warrior-runner.sh |
00:31
🔗
|
|
kiska1825 has joined #urlteam |
00:31
🔗
|
|
kiska1825 has quit IRC (Client Quit) |
00:32
🔗
|
|
kiska1825 has joined #urlteam |
00:32
🔗
|
|
Ryz has joined #urlteam |
00:42
🔗
|
mgrandi |
and @JAA i tried to find the spot to add timestamps to the log message but they are all print() statements =( sorry |
00:44
🔗
|
mgrandi |
well, not all of them, i think some of them are print, it seems stuff like `OwlyService` are using logging |
00:44
🔗
|
JAA |
Right, and the default logging output is just the message. |
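For the parts that do go through `logging`, timestamps can be added with a formatter. This is only a sketch with the standard library; the logger name `demo.scraper` and the helper `make_timestamped_logger` are made up for illustration, and plain `print()` calls would be unaffected.

```python
import io
import logging


def make_timestamped_logger(name, stream):
    """Return a logger whose records are written to `stream` with a timestamp."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter(
            "%(asctime)s %(levelname)s %(name)s: %(message)s",
            datefmt="%Y-%m-%d %H:%M:%S",
        )
    )
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Keep the demo self-contained: do not also send records to the root logger.
    logger.propagate = False
    return logger


# Write into an in-memory buffer here; a real script would pass sys.stderr.
buffer = io.StringIO()
log = make_timestamped_logger("demo.scraper", buffer)
log.warning("tracker returned 404")
output = buffer.getvalue()
```

Each line then starts with a date and time, so the age of the errors in a scrollback can be read off directly.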
00:58
🔗
|
|
acitrin has joined #urlteam |
00:58
🔗
|
acitrin |
hello |
00:59
🔗
|
acitrin |
is anyone here? |
00:59
🔗
|
|
acitrin has quit IRC (Client Quit) |
01:05
🔗
|
Jake |
hi. oh he left |
01:10
🔗
|
JAA |
How dare you not respond within 58 seconds?! |
01:11
🔗
|
JAA |
Er, 42 actually. |
01:11
🔗
|
mgrandi |
the tracker for urlteam2 seems to be returning a ton of 502 errors today |
01:43
🔗
|
|
mgrandi has quit IRC (Read error: Operation timed out) |
03:45
🔗
|
|
mgrandi has joined #urlteam |
12:39
🔗
|
phuzion |
Tracker needs a restart. Can't log into the admin interface |
13:07
🔗
|
JAA |
chfoo, Kaz: ^ |
13:50
🔗
|
|
Terbium_ has joined #urlteam |
14:02
🔗
|
|
Terbium has quit IRC (se.hub efnet.deic.eu) |
14:20
🔗
|
|
Ryz has quit IRC (Remote host closed the connection) |
14:20
🔗
|
|
kiska1825 has quit IRC (Remote host closed the connection) |
14:21
🔗
|
|
Ryz has joined #urlteam |
14:21
🔗
|
|
kiska1825 has joined #urlteam |
15:40
🔗
|
chfoo |
i think someone restarted it already? it's slow, but looks ok |
15:42
🔗
|
chfoo |
also, the pipeline script eventually errors out because it was the simplest way to eventually exit if the user requested the warrior to stop or change projects |
17:11
🔗
|
|
mgrandi has quit IRC (Leaving) |
19:38
🔗
|
|
Ryz has quit IRC (Remote host closed the connection) |
19:38
🔗
|
|
kiska1825 has quit IRC (Remote host closed the connection) |
19:39
🔗
|
|
Ryz has joined #urlteam |
19:39
🔗
|
|
kiska1825 has joined #urlteam |
20:02
🔗
|
|
n9nes has joined #urlteam |
20:07
🔗
|
|
n9nes- has quit IRC (Read error: Operation timed out) |
23:39
🔗
|
|
kiska1825 has quit IRC (Remote host closed the connection) |
23:39
🔗
|
|
Ryz has quit IRC (Remote host closed the connection) |
23:39
🔗
|
|
kiska1825 has joined #urlteam |
23:40
🔗
|
|
Ryz has joined #urlteam |