#urlteam 2013-11-05,Tue

Time Nickname Message
12:57 🔗 edsu xmc: i'm a software developer, if there are any tasks that need doing to help get the tracker working
12:57 🔗 edsu xmc: just offering to help, if any is needed :)
13:57 🔗 ersi Well, the whole thing is kind of in need of being rewritten edsu.
13:58 🔗 twrist https://github.com/ArchiveTeam/tinyback go at it
13:59 🔗 twrist er, https://github.com/ArchiveTeam/tinyarchive
14:01 🔗 ersi though patching up whatever keeps breaking the tracker would probably be low-hanging fruit
15:04 🔗 GLaDOS That's the thing; the entire thing is patches.
15:05 🔗 ersi Uh, not quite.
15:05 🔗 GLaDOS many thanks for ops
15:06 🔗 ersi But it is mostly a prototype gone wild. :)
15:06 🔗 GLaDOS Well from how soult described it, it sounded like that
15:06 🔗 ersi It's hacky, sure - but not "patches"
15:06 🔗 GLaDOS Eh, close enough.
15:09 🔗 edsu so i'm still learning my way around ; is the urlteam tracker different from the archiveteam tracker?
15:10 🔗 xmc yes
15:11 🔗 ersi Indeed
15:12 🔗 ersi https://github.com/ArchiveTeam/tinyarchive is the urlteam tracker.
15:12 🔗 ersi https://github.com/ArchiveTeam/tinyback is the urlteam "worker"
15:18 🔗 edsu got it
15:19 🔗 edsu i imagine those berkeleydbs are getting quite large? or are they purged periodically and saved off to a file?
15:19 🔗 ersi yeah
15:20 🔗 * edsu is still reading obv :)
15:20 🔗 ersi purged periodically
15:20 🔗 ersi I've got some info that isn't in the source code; it's from soultcer (the author), from this chat
15:21 🔗 GLaDOS Oh, yeah, they get purged
15:21 🔗 GLaDOS WHEN THE PURGING WORKS
15:21 🔗 edsu :-D
15:21 🔗 edsu is that the main problem?
15:21 🔗 GLaDOS Possibly
15:21 🔗 GLaDOS The tracker either just freezes up or gets killed
15:21 🔗 GLaDOS Literally, just killed.
15:21 🔗 GLaDOS No traceback
15:21 🔗 edsu is it running on a vm somewhere?
15:22 🔗 edsu oom killer killing it because it is using too much memory?
15:22 🔗 GLaDOS Nah, it's on some crappy kimsufi dedi
15:22 🔗 edsu huh, ok
15:22 🔗 GLaDOS And the killing stopped like a week ago, it just freezes
15:27 🔗 edsu looks like it's using web.py is it running as a single process or something else like apache/mod-wsgi?
15:30 🔗 * edsu wonders how big the sqlite db is
15:36 🔗 GLaDOS edsu: well it's launched by running tracker.py, so unless that creates multiple web.py processes..
15:37 🔗 GLaDOS Also, "45M tasks.sqlite"
15:41 🔗 edsu got it, any chance of getting a copy of the db?
15:43 🔗 GLaDOS http://dumpground.archivingyoursh.it/tasks.sqlite
15:44 🔗 _46bit What am I downloading?
15:46 🔗 GLaDOS An sqlite database which is probably corrupted beyond repair
15:46 🔗 GLaDOS but meh
15:48 🔗 edsu sqlite file opens at least :)
15:48 🔗 edsu sqlite> select count(*) from task;
15:48 🔗 edsu 191000
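The check edsu runs by hand above can be scripted. The following is a minimal sketch, assuming only a local copy of the SQLite file; the function name `inspect_db` is hypothetical and not part of tinyarchive. It runs SQLite's built-in `PRAGMA integrity_check` and counts the rows in every table, so both the "corrupted" worry and the row count can be checked in one pass.

```python
import sqlite3


def inspect_db(path):
    """Return the integrity-check result and a row count for every table.

    Hypothetical helper for illustration; not part of tinyarchive.
    """
    conn = sqlite3.connect(path)
    try:
        # integrity_check yields a single row containing "ok" when no
        # corruption is found, otherwise one row per problem detected.
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in tables:
            # Table names cannot be bound as parameters, so quote them.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT count(*) FROM {quoted}").fetchone()[0]
        return {"integrity": integrity, "counts": counts}
    finally:
        conn.close()
```

Calling `inspect_db("tasks.sqlite")` on the downloaded file would report the `task` table alongside any others the schema defines.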
15:49 🔗 GLaDOS THAT'S THE CAUSE!
15:50 🔗 edsu ?
15:50 🔗 GLaDOS probably
15:50 🔗 GLaDOS hopefully
15:50 🔗 edsu it's my fault, isn't it
15:50 🔗 GLaDOS nah
15:50 🔗 GLaDOS i haven't dived into the internals yet
15:50 🔗 GLaDOS wanted to keep my sanity
15:50 🔗 edsu cause is too many rows?
15:50 🔗 GLaDOS maybe, no idea.
15:51 🔗 edsu the python i've been looking at so far is quite readable
15:51 🔗 GLaDOS (spoiler: I cannot into programming. I merely host things.
15:51 🔗 edsu have only looked at a small pocket of it though
15:51 🔗 GLaDOS )
15:51 🔗 edsu hosting things is the hard part :)
15:52 🔗 GLaDOS I find it easy now.
15:54 🔗 edsu http://ec2-54-204-142-70.compute-1.amazonaws.com:8080/
15:55 🔗 GLaDOS Leave it for a day, it'll freeze up.
15:55 🔗 edsu is the symptom of it freezing up that it just stops responding to http requests?
15:56 🔗 GLaDOS Yeah, as people seem to complain
15:56 🔗 edsu so it might require people to actually use it i guess
15:56 🔗 GLaDOS 50x, but I'm not sure if that's from varnish
15:56 🔗 edsu i guess i could use it myself
15:57 🔗 edsu just point an instance of tinyback at it
15:58 🔗 edsu could that disturb anything, or can they run in isolation from the rest of the infrastructure ok?
15:58 🔗 GLaDOS yeah, they run in isolation fine
16:01 🔗 GLaDOS WELL, NOW I SLEEP
16:02 🔗 edsu 'nite :)
16:02 🔗 edsu and thanks for the help
16:06 🔗 edsu i wonder if there are some cron jobs that run periodically
16:11 🔗 edsu i guess David Triendl (soult) isn't on the scene anymore?
16:12 🔗 edsu or he is back doing his studies :)
17:30 🔗 edsu GLaDOS: when you wake up again, i'd be interested to know if there's anything running from cron for tinyarchive
19:31 🔗 edsu if anyone else wants to point there tinyback at http://ec2-54-204-142-70.compute-1.amazonaws.com:8080 to help me debug it that would be swell
19:31 🔗 edsu s/there/their/
19:58 🔗 pft edsu: i'm pointing at it
19:59 🔗 pft theoretically. see two WARNING messages about ServiceExceptions
20:04 🔗 edsu pft: awesome, can you paste?
20:04 🔗 edsu are you Levon?
20:04 🔗 pft 2013-11-05 20:57:49,923 tinyback.Reaper WARNING: ServiceException(Unexpected response on status 200) on code b4ont9
20:04 🔗 pft yes
20:04 🔗 pft 2013-11-05 20:57:51,780 tinyback.Reaper WARNING: ServiceException(HTTP status changed from 200 to 301 on second request) on code b4ont9
20:04 🔗 pft those are failures on the shortener i assume
20:04 🔗 pft 2013-11-05 21:03:43,322 tinyback.Tracker WARNING: Server refused data for task 571a69c8-29ad-11e3-8225-00224d7a9dd0
20:04 🔗 pft not sure what that is
20:05 🔗 edsu yes, that does look like a shortener error
20:05 🔗 edsu but that last one is on the tinyarchive side
20:05 🔗 pft yeah that's what i was suspecting
20:12 🔗 edsu looks like the server refused is when the task is PUT back to tinyarchive and tinyback gets a 409 Conflict error
20:15 🔗 pft ahh
20:16 🔗 edsu looks like that can happen when the task isn't 'properly assigned' https://github.com/ArchiveTeam/tinyarchive/blob/master/tracker/tracker.py#L244
20:16 🔗 edsu maybe that prevents multiple people from reporting back on the same task?
20:17 🔗 pft seems likely
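The guard edsu is guessing at can be illustrated in isolation. This is a sketch under stated assumptions, not tinyarchive's actual code: the class `TaskTracker`, its methods, and the idea that assignment is tracked per worker name are all hypothetical. It only shows how rejecting a submission for a task that is not currently assigned to the submitter would produce a 409 and stop two workers from reporting the same task.

```python
class TaskTracker:
    """Toy in-memory tracker (hypothetical; not tinyarchive's implementation)."""

    def __init__(self):
        self._assigned = {}    # task_id -> name of the worker holding the task
        self._finished = set()  # task_ids whose results were accepted

    def assign(self, task_id, worker):
        """Hand a task to a worker."""
        self._assigned[task_id] = worker

    def submit(self, task_id, worker):
        """Return an HTTP-style status code for a result submission."""
        # Reject results for tasks this worker does not currently hold.
        # That covers unknown tasks, tasks held by someone else, and
        # tasks that were already submitted once.
        if self._assigned.get(task_id) != worker:
            return 409
        del self._assigned[task_id]
        self._finished.add(task_id)
        return 200
```

With this shape, a second submission of the same task gets a 409 even from the original worker, which would match a "Server refused data" warning appearing on a retry.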
20:17 🔗 edsu i could add some more logging if you keep seeing errors like that
20:17 🔗 edsu i can see you've submitted 5 tasks so far ok
20:18 🔗 edsu and an anonymous user is doing some as well
20:19 🔗 pft yeah i see some things going by
20:20 🔗 edsu thanks for kicking the tires, it's funny when you run software and want to see it fail :)
20:21 🔗 pft that's most of my life over here ;)
20:31 🔗 pft and it appears to be out of tasks now
20:33 🔗 edsu yes
20:33 🔗 edsu any idea how to populate more?
20:33 🔗 pft I have no idea
20:33 🔗 * edsu eyes task_create.py
20:33 🔗 pft ahahah
20:34 🔗 pft maybe if you punch some random buttons on that!
20:34 🔗 edsu hehe
20:35 🔗 pft i keep meaning to look into more archive team code stuff but with everything going on for me right now i simply haven't got the time
20:37 🔗 edsu i know the feeling
20:53 🔗 edsu hmm, task_create.py seems to be part of the story
20:53 🔗 edsu but not all of it
20:54 🔗 pft :|
20:55 🔗 edsu looks like you picked up some more work though?
20:55 🔗 pft it seems to have
20:56 🔗 edsu hmm well, ok then :-D
20:57 🔗 pft now it says not asks
20:57 🔗 pft er no tasks
20:57 🔗 pft now it got a task!
20:57 🔗 pft unpredictable
20:57 🔗 edsu hmm i guess i need to look at how that works
20:57 🔗 edsu i think it's somewhat time dependent
20:57 🔗 edsu "Any sufficiently advanced technology is indistinguishable from magic."
20:58 🔗 pft yeah
20:58 🔗 pft probably more logging and more status outputs would be helpful
20:58 🔗 pft there's my random software development statement that's applicable to everything for the day
21:01 🔗 edsu it's so true, though
21:02 🔗 edsu print considered helpful
21:03 🔗 edsu pft: you're catching up http://ec2-54-204-142-70.compute-1.amazonaws.com:8080/
21:05 🔗 pft i am ON FIRE!
21:05 🔗 edsu dude, yes
21:11 🔗 Benjojo Hey, How do I get involved in this? Whats the software requirements?
21:12 🔗 pft how involved do you want to be? you can always download and run the archiveteam warrior VM
21:12 🔗 pft http://www.archiveteam.org/index.php?title=ArchiveTeam_Warrior
21:12 🔗 Benjojo Well, I'm interested in the URLTeam part, I have the server infrastructure to help
21:13 🔗 pft well in terms of fetching-and-submitting the warrior is the easiest way
21:13 🔗 Benjojo Okay, Sorry I have no idea how any of this works :)
21:13 🔗 pft if you want to actually get into running server infrastructure I'm not sure, GLaDOS runs some of that, I believe
21:14 🔗 pft in terms of downloading data from the url shorteners and submitting it to the archive you just need to get the warrior vms running
21:14 🔗 Benjojo Hm, Would I be able to load that into ESXi?
21:15 🔗 pft the vm is distributed as an .ova file, i believe you can pull that into vmware
21:16 🔗 pft hmm, perhaps not, though http://badcheese.com/~steve/atlogs/?chan=warrior&day=2012-10-13
21:18 🔗 Benjojo Hmm
21:19 🔗 pft i futzed with trying to get the warrior to work in vmware fusion at one point and gave up and just installed virtualbox
21:19 🔗 Benjojo Yeah, All of my systems are ESXi for simplicity so It would be useful if I could just load a few templates in a few boxes
21:19 🔗 Benjojo But that hope seems to look a little distant
21:20 🔗 pft you could try pulling the .ova file into VMWare and see if it works
21:21 🔗 Benjojo Na
21:21 🔗 Benjojo I tried
21:21 🔗 Benjojo and it said it was not supported
21:21 🔗 pft ahh :(
21:21 🔗 Benjojo http://i.imgur.com/aDUd3qo.png
21:22 🔗 pft http://tad-do.net/2012/01/30/converting-virtualbox-to-vmware-esxi/
21:22 🔗 pft hmm
21:22 🔗 pft i haven't used esx in years so i have no idea
21:24 🔗 Benjojo Hm, I think I will look into this later
21:24 🔗 pft roger that
21:24 🔗 pft it would probably help out the project a lot if you could figure out how to make the vm load in esx and update the wiki
21:28 🔗 edsu i saw you can load a vmware image into ec2 ; but haven't tried w/ the warrior
21:29 🔗 edsu Benjojo: when the tracker is working, you can also run a python program to get urlteam tasks and submit them back
21:29 🔗 Benjojo Oh?
21:30 🔗 edsu Benjojo: i say this as someone who has been lurking in here for 1 day ; so please apply buolder of salt
21:30 🔗 pft yeah. that's a little more involved because you have to know what's going on with the tracker but it will run in a base level debian install
21:30 🔗 edsu boulder
21:30 🔗 Benjojo Ah, Alrighty
21:31 🔗 pft that's essentially what the warrior vm is - a debian install with some packages installed and the seesaw script
21:31 🔗 edsu Benjojo: you see the section on TinyBack here? http://archiveteam.org/?title=URLTeam
21:31 🔗 pft the nice thing about the warrior is that it phones home so if new projects are added or trackers change, it can get that information and update accordingly
21:31 🔗 edsu git clone ; ./run.py
21:32 🔗 edsu alas the tracker is currently doing this http://urlteam.terrywri.st/ :-|
21:32 🔗 edsu pft: i'm assuming, perhaps wrongly, that the warrior talks to that tracker too by default?
21:33 🔗 pft i believe it does
21:33 🔗 pft right now people running the warrior are probably hanging out with a screen that says "No item received. Retrying after 30 seconds..."
21:35 🔗 pft i think the blip.tv project is on hold and formspring seems to have 0 pendings so the warrior is probably idle for most people
21:35 🔗 pft http://tracker.archiveteam.org/
21:42 🔗 edsu pft: is there a way to see stats on what's happening at http://tracker.archiveteam.org/ ?
21:42 🔗 edsu or do you have to run the warrior to see the leaderboard, etc?
21:42 🔗 edsu or perhaps i'm just not clicking on the right link :)
21:43 🔗 edsu oh i see they each have their own leaderboard, duh
21:43 🔗 pft yeah
21:43 🔗 pft you can see stats for particular projects but urlteam works differently
21:43 🔗 edsu e.g. http://tracker.archiveteam.org/bloopertv/
23:23 🔗 pft edsu: do you want me to leave this going?
