#urlteam 2020-07-02,Thu


Time Nickname Message
00:00 🔗 JAA It's visible when running the scripts as well
00:01 🔗 mgrandi i'm wondering if it's because of this: https://github.com/ArchiveTeam/terroroftinytown-client-grab/blob/f14658cbac494883dc6bb0038d4546354f5f3345/scraper.py#L32
00:02 🔗 mgrandi i'm pretty new on how all this works, but i noticed that seesaw is basically executing `python.exe pipeline.py`, so i'm wondering if the logger is writing that error to stdout and then seesaw is seeing that as an error?
00:03 🔗 mgrandi it's intermittent: it works most of the time, but i've seen it happen like 5 or so times in a day
00:04 🔗 JAA I have dozens of these in my scrollback.
00:04 🔗 JAA No idea how far back that goes since there are no timestamps anywhere in the output, but I guess it's a couple hours maybe?
00:04 🔗 mgrandi err maybe not that flush call specifically
00:05 🔗 mgrandi suspicion is that the script (correctly) just throws an exception and bails (terroroftinytown-client-grab/pipeline.py) when it can't contact the tracker
00:05 🔗 mgrandi and with no logging set up it just logs everything to stdout, so i'm wondering if this is a seesaw / 'warrior web UI' bug or something
00:06 🔗 mgrandi i'm not sure what the 'contract' is for the pipeline scripts that get run by seesaw on what to do / not do in an error
00:06 🔗 JAA I think it's just this: https://github.com/ArchiveTeam/terroroftinytown-client-grab/blob/f14658cbac494883dc6bb0038d4546354f5f3345/scraper.py#L74
00:07 🔗 JAA I.e. if one scraper.py process fails to get an item with a 404 five times, it barks.
00:07 🔗 mgrandi possibly, it's hard to tell what is happening because i think seesaw is logging the output of the "pipeline subprocess" as if it were its own, heh
00:08 🔗 JAA I don't think seesaw is doing anything with regards to that. It just passes the stdout and stderr FDs on to the child process probably.
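The inheritance JAA describes is ordinary process behaviour and can be sketched in a few lines. This is not seesaw's code: `run_child` and `CHILD_CODE` are hypothetical names, and the sketch only shows that a child started without explicit stdout/stderr arguments writes to the parent's own file descriptors, so its output is indistinguishable from the parent's.

```python
import subprocess
import sys

# Hypothetical child program: writes one line to stdout and one to stderr.
CHILD_CODE = (
    "import sys; "
    "print('out from child'); "
    "print('err from child', file=sys.stderr)"
)


def run_child(capture=False):
    """Run a small child Python process (illustration only).

    With capture=False no stdout/stderr arguments are passed, so the child
    inherits the parent's file descriptors and its output appears mixed in
    with the parent's. With capture=True the output goes into pipes instead.
    """
    kwargs = {}
    if capture:
        kwargs = {
            "stdout": subprocess.PIPE,
            "stderr": subprocess.PIPE,
            "text": True,
        }
    return subprocess.run([sys.executable, "-c", CHILD_CODE], **kwargs)


if __name__ == "__main__":
    # Inherited FDs: both lines show up directly in this terminal.
    run_child()
```

When the output is inherited, `CompletedProcess.stdout` is `None`, because the parent never sees the bytes; they go straight to the shared terminal or log.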
00:08 🔗 mgrandi to run it without the warrior, is it just `run-pipeline pipeline.py --concurrent 2 YOURNICKHERE` ? with seesaw installed?
00:09 🔗 mgrandi (hard to edit / test stuff out with the docker image)
00:11 🔗 JAA I think so, yeah. There's something about automatic updating also, but otherwise, that's it. Possibly also --disable-web-client if you don't want the web interface for that pipeline (required if you run more than one pipeline on one machine unless you change ports somehow).
00:15 🔗 JAA I think the proper way to fix this would be to replace `six.raise_from` with a proper `raise TrackerError(...) from error` at https://github.com/ArchiveTeam/terroroftinytown/blob/7c0093ba8b3622d1f6198188b1dd535e6698bf5d/terroroftinytown/client/tracker.py#L30 and then inspect the chained exception in scraper.py and just `sys.exit(1)` or something instead of `raise` if it's a 404.
00:16 🔗 JAA `raise from` only works on Python 3, but 2 is EOL anyway finally.
00:18 🔗 JAA Alternatively, the original exception could be stored as an attribute in TrackerError, but exception chaining is cleaner in my opinion.
00:18 🔗 mgrandi isn't six meant to handle python 2 and 3 stuff? so six.raise_from probably does 'raise from' on python3
00:19 🔗 mgrandi i think it is correctly determining it is a TrackerError, so all it needs to do is just do sys.exit(1) or something
00:20 🔗 JAA Yes and yes. `six.raise_from(a, b)` is equivalent to `raise a from b`. But on Python 2, it's simply `raise a` since there is no way to do the chaining.
00:20 🔗 mgrandi i dunno what the separation of responsibilities here is, i assume seesaw is the arbiter of how many times stuff gets run so i would feel like the pipeline shouldn't even have logic for repeated failures
00:20 🔗 JAA So if you need the original exception, you can't do that with Python 2.
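The fix JAA proposes can be sketched as follows, assuming Python 3. Only the name `TrackerError` comes from the discussion; `HTTPStatusError`, `get_item`, `exit_code_for` and `main` are hypothetical stand-ins, and the real client may carry the status code differently.

```python
import sys


class TrackerError(Exception):
    """Stand-in for the client's TrackerError (hypothetical definition)."""


class HTTPStatusError(Exception):
    """Hypothetical low-level error carrying an HTTP status code."""

    def __init__(self, status_code):
        super().__init__("HTTP %d" % status_code)
        self.status_code = status_code


def get_item(status_code):
    """Pretend to ask the tracker for an item and fail with status_code."""
    try:
        raise HTTPStatusError(status_code)
    except HTTPStatusError as error:
        # Python 3 chaining: the original error is stored as __cause__.
        raise TrackerError("Could not get item from tracker") from error


def exit_code_for(exc):
    """Return 1 for a chained 404, or None if the error should propagate."""
    cause = exc.__cause__
    if isinstance(cause, HTTPStatusError) and cause.status_code == 404:
        return 1
    return None


def main(status_code):
    try:
        get_item(status_code)
    except TrackerError as exc:
        code = exit_code_for(exc)
        if code is None:
            raise  # not a 404: keep the traceback
        sys.exit(code)  # 404: exit quietly with a non-zero status
```

On Python 2 there is no `__cause__` to inspect, which is why the alternative of storing the original exception as an attribute on `TrackerError` was mentioned.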
00:20 🔗 mgrandi ah, is the docker container running it under python2?
00:21 🔗 JAA I hope not, but I have no idea.
00:22 🔗 mgrandi i feel like the docker image needs a refresh, i got mega frustrated cause i was trying to run the mercurial project and that needs to compile `wget-lua` and that needs libraries that you can't get on the linux distro that the docker image is on
00:24 🔗 mgrandi according to the warrior code i think it uses python3
00:24 🔗 mgrandi https://github.com/ArchiveTeam/warrior-code2/blob/development/warrior-runner.sh
00:31 🔗 kiska1825 has joined #urlteam
00:31 🔗 kiska1825 has quit IRC (Client Quit)
00:32 🔗 kiska1825 has joined #urlteam
00:32 🔗 Ryz has joined #urlteam
00:42 🔗 mgrandi and @JAA i tried to find the spot to add timestamps to the log message but they are all print() statements =( sorry
00:44 🔗 mgrandi well, not all of them, i think some of them are print, it seems stuff like `OwlyService` are using logging
00:44 🔗 JAA Right, and the default logging output is just the message.
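For the messages that do go through `logging`, timestamps can be added with a formatter. A minimal sketch, assuming the scripts log through the standard library root logger; `configure_logging` is a hypothetical helper, and it does nothing for the plain `print()` calls.

```python
import logging


def configure_logging(stream=None):
    """Attach a handler to the root logger that prefixes a timestamp.

    stream defaults to sys.stderr, as with any logging.StreamHandler.
    """
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.INFO)
    return handler


if __name__ == "__main__":
    configure_logging()
    # The logger name here is only an example.
    logging.getLogger("demo").info("fetched item")
```

Each record then starts with the date and time in the default `asctime` format, followed by the level, the logger name and the message.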
00:58 🔗 acitrin has joined #urlteam
00:58 🔗 acitrin hello
00:59 🔗 acitrin is anyone here?
00:59 🔗 acitrin has quit IRC (Client Quit)
01:05 🔗 Jake hi. oh he left
01:10 🔗 JAA How dare you not respond within 58 seconds?!
01:11 🔗 JAA Er, 42 actually.
01:11 🔗 mgrandi the tracker for urlteam2 seems to be returning a ton of 502 errors today
01:43 🔗 mgrandi has quit IRC (Read error: Operation timed out)
03:45 🔗 mgrandi has joined #urlteam
12:39 🔗 phuzion Tracker needs a restart. Can't log into the admin interface
13:07 🔗 JAA chfoo, Kaz: ^
13:50 🔗 Terbium_ has joined #urlteam
14:02 🔗 Terbium has quit IRC (se.hub efnet.deic.eu)
14:20 🔗 Ryz has quit IRC (Remote host closed the connection)
14:20 🔗 kiska1825 has quit IRC (Remote host closed the connection)
14:21 🔗 Ryz has joined #urlteam
14:21 🔗 kiska1825 has joined #urlteam
15:40 🔗 chfoo i think someone restarted it already? it's slow, but looks ok
15:42 🔗 chfoo also, the pipeline script eventually errors out because it was the simplest way to eventually exit if the user requested the warrior to stop or change projects
17:11 🔗 mgrandi has quit IRC (Leaving)
19:38 🔗 Ryz has quit IRC (Remote host closed the connection)
19:38 🔗 kiska1825 has quit IRC (Remote host closed the connection)
19:39 🔗 Ryz has joined #urlteam
19:39 🔗 kiska1825 has joined #urlteam
20:02 🔗 n9nes has joined #urlteam
20:07 🔗 n9nes- has quit IRC (Read error: Operation timed out)
23:39 🔗 kiska1825 has quit IRC (Remote host closed the connection)
23:39 🔗 Ryz has quit IRC (Remote host closed the connection)
23:39 🔗 kiska1825 has joined #urlteam
23:40 🔗 Ryz has joined #urlteam
