[14:32] GLaDOS: sure, yeah
[14:34] ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5/yScdTEOtHvkwh92s4Ry4I+gUfk3UC/+6M4LuM/kdAF69QONR4JLyR9baesCOrj64ajvlCYFwWJaP1/tMLup2ECCTvtEpazh0Jp0/iFLLb+kJVWqKxpbf6qqWihW3mErQqUxgdkJ05GhPC8DjoBY9EI01f2JuOWLdJP0Iw9mnt/T8hEmzh5VTeL9m3/+UJ+KXQRGlH811IkOTHVILD+DaoVBPH+W1O8LMjfH6O3hJFWksHmACSshhj2xyvY3Xrpc/bp32dKChVv2NDITh/iAEf3mlZ7drsu/Dw3ikLy9/kDhI5Z29XhMv2XeQTi3HSU+eBUM3AJlORvyI8v7/moN ed@prajna.home
[14:37] Okay, try connecting to urlteam@urlteam.terrywri.st
[14:45] It's running in a tmux session
[14:48] ok, i'm there
[14:48] can definitely see the warrior traffic with varnishncsa
[14:48] :-)
[14:49] tmux uses ctrl-b instead of ctrl-a like screen?
[14:50] is it ok if i enable varnish logging?
[14:53] well i did it, you should see /var/log/varnish/varnishncsa.log now
[14:55] Yeah, it uses ^B
[14:55] Also yeah, go ahead
[14:56] :O
[14:56] warriror vor urlvoid
[14:56] uh urltean?
[14:57] I like
[14:57] so can i start up the tracker again, so we can see what errors we might get?
[14:57] i still haven't seen any on my dev instances after thousands of tasks from 5-6 clients
[14:58] jeezum, there are 39732 files in /home/urlteam/tinyarchive/tracker/files
[14:59] that's where the task results are stored temporarily before they are processed by other bits of the code i haven't looked at yet
[15:03] looks like one of them is corrupted, won't uncompress: tracker/files/tmp025hl6
[15:04] * edsu is counting how many url mappings are in tracker/files/*
[15:04] zcat tracker/files/* | wc -l
[15:04] You can, it should be in the history for the top tmux window
[15:04] 609,394
[15:04] i wonder if that corrupted gzip file was messing things up
[15:05] ok, i'll restart
[15:05] Also, possibly.
[15:05] i think i can remove the task for that file
[15:05] from the sqlite db
[15:05] ok, I have a tinyback instance running now :)
[15:07] ok, i removed the task associated with the corrupted file
[15:07] edsu: notice the initial influx of people?
[15:07] oh wait, you will
[15:07] i haven't started it up yet
[15:08] i'll do that now :-)
[15:08] oh god, that's not the version of tmux i think it is
[15:08] although it might be nice to let it clean up what it has before opening up to the world?
[15:08] Or maybe just try exporting it all?
[15:08] so what commands would you normally run when you remembered?
[15:08] or was something else hitting the admin urls?
[15:09] It was just a run of cleanup.py every so often
[15:09] ok, how about we try that first before opening up to the world?
[15:10] i can bring it up on another port and we can issue cleanup?
[15:11] You could just take varnish down..
[15:12] alright i'll do that
[15:12] actually i'll just use a different port
[15:12] right now varnish is giving a nice 503 error
[15:13] which is better for the warriors than some kind of connection refused error; although maybe it doesn't matter
[15:13] i can already see the problem :)
[15:13] i brought it up on :7091
[15:16] oops, i mean :8000
[15:17] doing an /admin/cleanup now
[15:18] still have the files in tinyarchive/tracker/files
[15:22] i'm still confused about where the data goes after tasks have been stored
[15:23] is there something that comes along periodically and collects up the data via the web api?
[15:24] is David Triendl still around?
[15:26] soultcer: ping :)
[15:27] i suspect that there is something that comes along and collects stuff
[15:31] fetch_finished is supposed to be able to do it
[15:33] oh
[15:33] you want to try running that?
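The corrupted-file hunt and the mapping count above (zcat tracker/files/* | wc -l) can be done in one pass. A minimal Python sketch, assuming only what the log shows: the result files in tracker/files are gzipped text with one url mapping per line. The directory path is taken from the log; everything else is illustrative:

    import glob
    import gzip
    import zlib

    # Count url mappings across the tracker's gzipped result files and
    # flag any file that fails to decompress (like tmp025hl6 above).
    total = 0
    for path in sorted(glob.glob("/home/urlteam/tinyarchive/tracker/files/*")):
        try:
            with gzip.open(path, "rt") as f:
                total += sum(1 for _ in f)
        except (OSError, EOFError, zlib.error) as exc:
            print("corrupted: %s (%s)" % (path, exc))
    print("total mappings: %d" % total)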
[15:34] the tracker is running at http://localhost:8000
[15:34] still
[15:35] from the varnishncsa log it looks like there are 37 active warriors
[15:36] i can run fetch_finished i guess
[15:39] seems to be working
[15:43] waiting for it to stop :)
[15:47] you didn't define an output directory, did you?
[15:47] no, so it is going in '.' :-)
[15:47] i can move them elsewhere when it's done if you want
[15:47] Ah, true.
[15:47] where do you typically dump them?
[15:47] Funny thing, never was able to dump them
[15:48] yeah, it uses the web app
[15:48] ah
[15:48] the tracker api, so if the tracker was frozen, the script would get jammed up too i guess
[15:48] well, i'm going to sleep
[15:48] have fun, i guess.
[15:48] what would you do with the files when they are done normally?
[15:48] ok, where are you btw?
[15:49] Perth, 'straya
[15:49] nice
[15:49] i'm ehs@pobox.com btw
[15:49] I'd just leave them in a directory. I have a copy of the URLTeam torrent locally
[15:49] if async communication works
[15:49] ah yeah
[15:49] i rarely use email
[15:49] so you normally create the files, and then create a torrent of it?
[15:50] Yeah, we have scripts in the same directory for importing them into the torrent db
[15:50] does the torrent get put up on internet archive i guess?
[15:50] The torrent doesn't get put up AFAIK, but we give a copy to 301works
[15:50] gotcha
[15:51] night
[15:51] which script does the torrent db thing?
[15:51] g'night
[15:51] release_import
[15:51] thnx
[15:51] and stuff
[19:15] so is the real tracker back up?
[20:07] Hello
[20:07] Hello
[20:18] it's not back up yet
[20:18] getting close, the fetch_finished finally finished
[20:19] now to run release_import
[20:20] fetch_finished kept failing because there were 5 tasks that looked to be complete but lacked task files
[20:20] which caused it to abort
[20:30] might have to wait for GLaDOS to return
[20:31] i don't think release_import gets run after fetch_finished
[20:31] think there needs to be a call to create_release.py
[20:32] while the code is quite clean
[20:32] the process for creating a release does need better documentation, and perhaps automation
[20:41] pft: ok, i started it back up
[20:41] figure the release can wait, perhaps
[20:41] immediately i see a bunch of errors about the database being locked :)
[20:41] although some requests seem to be working ok
[20:42] might make sense to move from sqlite to mysql/postgres
[20:42] ugh, it's using sqlite?
[20:42] yeah
[20:43] bluh
[20:43] based on the number of clients that are probably running, it is most likely spending its life blocking, then
[20:43] right, yeah
[20:45] it's not actually submitting the sqlite files anywhere, is it?
[20:45] it didn't seem like a complicated schema when i glanced at the source earlier this week
[20:45] eventually they are dumped to a file that is torrented
[20:45] the schema is very lightweight, the submitted url mappings are stored in gzipped text files
[20:46] so it purges the sqlite file once they've dumped to a file?
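The failure mode described at 20:20 (tasks marked complete but with no task file on disk) could be checked for mechanically before rerunning fetch_finished. A hedged sqlite3 sketch: the database filename, the task table, and its columns (id, file, status) are assumptions for illustration, not the actual tinyarchive schema:

    import os
    import sqlite3

    FILES_DIR = "/home/urlteam/tinyarchive/tracker/files"

    # Hypothetical layout: a task table whose `file` column names the
    # gzipped result file. List finished tasks whose file is missing,
    # i.e. the ones that would make fetch_finished abort.
    conn = sqlite3.connect("tracker.db")
    rows = conn.execute("SELECT id, file FROM task WHERE status = 'finished'")
    for task_id, filename in rows:
        if not os.path.exists(os.path.join(FILES_DIR, filename)):
            print("task %s finished but %s is missing" % (task_id, filename))
            # The fix used in the log was removing such tasks, e.g.:
            # conn.execute("DELETE FROM task WHERE id = ?", (task_id,))
    conn.commit()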
[20:46] it just flags them as 'deleted'
[20:46] it doesn't actually delete them from the db
[20:46] ahhh ok
[20:47] so the database size does grow to infinity ;)
[20:47] lots of 500 errors now
[20:47] i guess, but there isn't a row for every mapping
[20:47] so maybe it's not too bad
[20:47] hmm ok
[20:47] i'm happy to modify it to use mysql or postgresql but i guess we need to know what GLaDOS would be happier running
[21:02] looks like it might be a one line change
[21:02] thanks to the abstraction web.py gives you
[21:04] nice!
[21:05] do archiveteam folks tend to prefer mysql to postgres or the other way around?
[21:05] i guess it is a question for GLaDOS
[21:05] that i don't know
[21:05] yeah
[21:05] when he is around
[21:06] i prefer postgres, but actually use mysql more (due to constraints at work)
[21:06] * edsu shrugs
[21:06] yeah, i prefer postgres
[21:08] though i mostly use mysql
[21:10] here's the line btw
[21:10] https://github.com/ArchiveTeam/tinyarchive/blob/master/tracker/tracker.py#L36
[21:10] well
[21:11] once you make create scripts and get the data types correct, then that's the line that needs to change ;)
[21:28] yeah, i think the create script is just sql
[21:28] didn't seem to be particularly sqlite specific
[21:29] https://github.com/ArchiveTeam/tinyarchive/blob/master/tracker/schema.sql
[21:29] right?
[21:29] might be some stuff to strip out
[21:30] ON CONFLICT IGNORE sticks out
[21:31] but other than that, seems pretty standardish?
[21:44] i'm not sure about "TEXT" as a field type
[21:44] i guess that is ok
[21:45] "When an applicable constraint violation occurs, the IGNORE resolution algorithm skips the one row that contains the constraint violation and continues processing subsequent rows of the SQL statement as if nothing went wrong."
[21:45] hahaha NOTHING TO SEE HERE! MOVE ALONG!
[21:46] yeah, this might actually be as easy as you said initially
[21:49] Hey, http://urlte.am/ says you guys need a server for storing / seeding the release torrent. I've got a dedicated server with 1Gbps that isn't doing much right now, so I could definitely seed that torrent pretty well. Is there anything else you guys could use a dedicated server for?
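The "one line change" at the tracker.py link above is plausible because web.py's web.database() selects its backend with the dbn parameter. A sketch under that assumption; the database names and credentials are placeholders, not the project's real configuration:

    import web

    # Roughly what the sqlite-backed tracker would be doing today:
    db = web.database(dbn="sqlite", db="tracker.db")

    # The postgres (or mysql) version is the same call with a different
    # dbn plus credentials; names and password here are placeholders.
    db = web.database(dbn="postgres", db="tinyarchive",
                      user="urlteam", pw="secret", host="localhost")

As noted at 21:30, the schema would also need ON CONFLICT IGNORE stripped out, with duplicate inserts instead handled by the application, for example by catching the backend's integrity error.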