[12:57] xmc: i'm a software developer, if there are any tasks that need doing to help get the tracker working
[12:57] xmc: just offering to help, if any is needed :)
[13:57] Well, the whole thing is kind of in need of being rewritten edsu.
[13:58] https://github.com/ArchiveTeam/tinyback go at it
[13:59] er, https://github.com/ArchiveTeam/tinyarchive
[14:01] though patching up whatever keeps breaking the tracker would probably be low-hanging fruit
[15:04] That's the thing; the entire thing is patches.
[15:05] Uh, not quite.
[15:05] many thansk for ops
[15:06] But it is mostly a prototype gone wild. :)
[15:06] Well from how soult described it, it sounded like that
[15:06] It's hacky, sure - but not "patches"
[15:06] Eh, close enough.
[15:09] so i'm still learning my way around ; is the urlteam tracker different from the archiveteam tracker?
[15:10] yes
[15:11] Indeed
[15:12] https://github.com/ArchiveTeam/tinyarchive is the urltream tracker.
[15:12] https://github.com/ArchiveTeam/tinyback is the urlteam "worker"
[15:18] got it
[15:19] i imagine those berkeleydbs are getting quite large? or are they purged periodically and saved off to a file?
[15:19] yeah
[15:20] * edsu is still reading obv :)
[15:20] purged periodically
[15:20] I got some infos that aren't in the source code, that are from soultcer (author)'s from this chat
[15:21] Oh, yeah, they get purged
[15:21] WHEN THE PURGING WORKS
[15:21] :-D
[15:21] is that the main problem?
[15:21] Possibly
[15:21] The tracker either just freezes up or gets killed
[15:21] Literally, just killed.
[15:21] No traceback
[15:21] is it running on a vm somewhere?
[15:22] oom killer killing it because it is using too much memory?
[15:22] Nah, it's on some crappy kimsufi dedi
[15:22] huh, ok
[15:22] And the killing stopped like a week ago, it just freezes
[15:27] looks like it's using web.py is it running as a single process or something else like apache/mod-wsgi?
[15:30] * edsu wonders how big the sqlite db is
[15:36] edsu: well it's launched by running tracker.py, so unless that creates multiple web.py processes..
[15:37] Also, "45M tasks.sqlite"
[15:41] got it, any chance of getting a copy of the db?
[15:43] http://dumpground.archivingyoursh.it/tasks.sqlite
[15:44] <_46bit> What am I downloading?
[15:46] An sqlite database which is probably corrupted beyond repaid
[15:46] but meh
[15:48] sqlite file opens at least :)
[15:48] sqlite> select count(*) from task;
[15:48] 191000
[15:49] THAT'S THE CAUSE!
[15:50] ?
[15:50] probably
[15:50] hopefully
[15:50] it's my fault, isn't it
[15:50] nah
[15:50] i haven't dived into the internals yet
[15:50] wanted to keep my sanity
[15:50] cause is too many rows?
[15:50] maybe, no idea.
[15:51] the python i've been looking at so far is quite readable
[15:51] (spoiler: I cannot into programming. I merely host things.
[15:51] have only looked at a small pocket of it though
[15:51] )
[15:51] hosting things is the hard part :)
[15:52] I find it easy now.
[15:54] http://ec2-54-204-142-70.compute-1.amazonaws.com:8080/
[15:55] Leave it for a day, it'll freeze up.
[15:55] is the symptom of it freezing up that it just stops responding to http requests?
[15:56] Yeah, as people seem to complain
[15:56] so it might require people to actually use it i guess
[15:56] 50x, but I'm not sure if that's from varnish
[15:56] i guess i could use it myself
[15:57] just point an instance of tinyback at it
[15:58] could that disturb anything, or can they run in isolation from the rest of the infrastructure ok?
[15:58] yeah, they run in isolation fine
[16:01] WELL, NOW I SLEEP
[16:02] 'nite :)
[16:02] and thanks for the help
[16:06] i wonder if there are some cron jobs that run periodically
[16:11] i guess David Triendl (soult) isn't on the scene anymore?
[16:12] or he is back doing his studies :)
[17:30] GLaDOS: when you wake up again, i'd be interested to know if there's anything running from cron for tinyarchive
[19:31] if anyone else wants to point there tinyback at http://ec2-54-204-142-70.compute-1.amazonaws.com:8080 to help me debug it that would be swell
[19:31] s/there/their/
[19:58] edsu: i'm pointing at it
[19:59] theoretically. see two WARNING messages about ServiceExceptions
[20:04] pft: awesome, can you paste?
[20:04] are you Levon?
[20:04] 2013-11-05 20:57:49,923 tinyback.Reaper WARNING: ServiceException(Unexpected response on status 200) on code b4ont9
[20:04] yes
[20:04] 2013-11-05 20:57:51,780 tinyback.Reaper WARNING: ServiceException(HTTP status changed from 200 to 301 on second request) on code b4ont9
[20:04] those are failures on the shortener i assume
[20:04] 2013-11-05 21:03:43,322 tinyback.Tracker WARNING: Server refused data for task 571a69c8-29ad-11e3-8225-00224d7a9dd0
[20:04] not sure what that is
[20:05] yes, that does look like a shortener error
[20:05] but that last one is on the tinyarchive side
[20:05] yeah that's what i was suspecting
[20:12] looks like the server refused is when the task is PUT back to tinyarchive and tinyback gets a 409 Conflict error
[20:15] ahh
[20:16] looks like that can happen when the task isn't 'properly assigned' https://github.com/ArchiveTeam/tinyarchive/blob/master/tracker/tracker.py#L244
[20:16] maybe that prevents multiple people from reporting back on the same task?
[20:17] seems likely
[20:17] i could add some more logging if you keep seeing errors like that
[20:17] i can see you've submitted 5 tasks so far ok
[20:18] and an anonymous user is doing some as well
[20:19] yeah i see some thigns going by
[20:20] thanks for kicking the tires, it's funny when you run software and want to see it fail :)
[20:21] that's most of my life over here ;)
[20:31] and it appears to be out of tasks now
[20:33] yes
[20:33] any idea how to populate more?
[20:33] I have no idea
[20:33] * edsu eyes task_create.py
[20:33] ahahah
[20:34] maybe if you punch some random buttons on that!
[20:34] hehe
[20:35] i keep meaning to look into more archive team code stuff but with everything going on for me right now is imply haven't got the time
[20:37] i know the feeling
[20:53] hmm, task_create.py seems to be part of the story
[20:53] but not all of it
[20:54] :|
[20:55] looks like you picked up some more work though?
[20:55] it seems to have
[20:56] hmm well, ok then :-D
[20:57] now it says not asks
[20:57] er no tasks
[20:57] now it got a task!
[20:57] unpredictable
[20:57] hmm i guess i need to look at how that works
[20:57] i think it's somewhat time dependent
[20:57] "Any sufficiently advanced technology is indistinguishable from magic."
[20:58] yeah
[20:58] probably more logging and more status outputs would be helpful
[20:58] there's my random software development statement that's applicable to everything for the day
[21:01] it's so true, though
[21:02] print considered helpful
[21:03] pft: you're catching up http://ec2-54-204-142-70.compute-1.amazonaws.com:8080/
[21:05] i am ON FIRE!
[21:05] dude, yes
[21:11] Hey, How do I get involved in this? Whats the software requirements?
[21:12] how involved do you want to be? you can always download and run the archiveteam warrior VM
[21:12] http://www.archiveteam.org/index.php?title=ArchiveTeam_Warrior
[21:12] Well, I'm interested in the URLTeam part, I have the server infrastructure to help
[21:13] well in terms of fetching-and-submitting the warrior is the easiest way
[21:13] Okay, Sorry I have no idea how any of this works :)
[21:13] if you want to actually get into running server infrastructure I'm not sure, GLaDOS runs some of that, I believe
[21:14] in terms of downloading data from the url shorteners and submitting it to the archive you just need to get the warrior vms running
[21:14] Hm, Would I be able to load that into EXSi?
[21:15] teh vm is distributed as an .ova file, i believe you can pull that into vmware
[21:16] hmm, perhaps not, though http://badcheese.com/~steve/atlogs/?chan=warrior&day=2012-10-13
[21:18] Hmm
[21:19] i futzed with trying to get the warrior to work in vmware fusion at one point and gave up and just installed virtualbox
[21:19] Yeah, All of my systems are EXSi for simplicity so It would be useful if I could just load a few templates in a few boxes
[21:19] But that hope seems to look a little distant
[21:20] you could try pulling the .ova file into VMWare and see if it works
[21:21] Na
[21:21] I tried
[21:21] and it said it was not suppored
[21:21] ahh :(
[21:21] http://i.imgur.com/aDUd3qo.png
[21:22] http://tad-do.net/2012/01/30/converting-virtualbox-to-vmware-esxi/
[21:22] hmm
[21:22] i haven't used esx in years so i have no idea
[21:24] Hm, I think I will look into this later
[21:24] roger that
[21:24] it would probably help out the project a lot if you cuold figure out how to make the vm load in esx and update the wiki
[21:28] i saw you can load a vmware image into ec2 ; but haven't tried w/ the warrior
[21:29] Benjojo: when the tracker is working, you can also run a python program to get urlteam tasks and submit them back
[21:29] Oh?
[21:30] Benjojo: i say this as soeone who has been lurking in here for 1 day ; so please apply buolder of salt
[21:30] yeah. that's a little more involved becuase you have to know what's going on with the tracker but it will run in a base level debian install
[21:30] boulder
[21:30] Ah, Alrighty
[21:31] that's essentialy what teh warrior vm is - a debian isntall with some packages installed and the seesaw script
[21:31] Benjojo: you see the section on TinyBack here? http://archiveteam.org/?title=URLTeam
[21:31] the nice thing about the warrior is that it phones home so if new projects are added or trackers change, it can get that information and update accordingly
[21:31] git clone ; ./run.py
[21:32] alas the tracker is currently doing this http://urlteam.terrywri.st/ :-|
[21:32] pft: i'm assuming, perhaps wrongly, that the warrior talks to that tracker too by default?
[21:33] i believe it does
[21:33] right now people running the warrior are probably hangingo ut with a screen that says "No item received. Retrying after 30 seconds..."
[21:35] i think the blip.tv project is on hold and formspring seems to have 0 pendings so teh warrior is probably idle for most people
[21:35] http://tracker.archiveteam.org/
[21:42] pft: is there a way to see stats on what's happening at http://tracker.archiveteam.org/ ?
[21:42] or do you have to run the warrior to see the leaderboard, etc?
[21:42] or perhaps i'm just not clicking on the right link :)
[21:43] oh i see they each have their own leaderboard, duh
[21:43] yeah
[21:43] you can see stats for particular projects but urlteam works differently
[21:43] e.g. http://tracker.archiveteam.org/bloopertv/
[23:23] edsu: do you want me to leave this going?
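
Note on the tasks.sqlite check at 15:48: the row count there came from the sqlite shell, and the "probably corrupted" remark at 15:46 was never actually tested in the log. A rough sketch of doing both from Python follows. The function name `inspect_db` is made up for illustration, and the only thing taken from the log is the table name `task`; nothing here is from the tinyarchive code itself.

```python
import sqlite3


def inspect_db(path, table="task"):
    """Return (integrity_result, row_count) for a SQLite file.

    Hypothetical helper: only the table name "task" comes from the log.
    """
    conn = sqlite3.connect(path)
    try:
        # The first row is "ok" when SQLite finds no structural corruption.
        integrity = conn.execute("PRAGMA integrity_check").fetchone()[0]
        # Table names cannot be bound as query parameters, so confirm the
        # name exists in sqlite_master before interpolating it.
        known = {row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")}
        if table not in known:
            raise ValueError("no such table: %s" % table)
        count = conn.execute(
            'SELECT COUNT(*) FROM "%s"' % table).fetchone()[0]
        return integrity, count
    finally:
        conn.close()
```

Calling `inspect_db("tasks.sqlite")` on a downloaded copy would say whether the file is structurally sound and how many rows the `task` table holds. It would not show whether the row count has anything to do with the freezes, which nobody in the log confirmed.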