Time |
Nickname |
Message |
19:26
|
ersi |
so if you got any more questions/thoughts, just shoot :-) |
19:26
|
shaqfu |
re: project RAM use: is it possible for the tracker to pass smaller jobs to low-power instances? |
19:26
|
ersi |
no such distinction today |
19:27
|
ersi |
unless we intentionally make "smaller" jobs by knowing they'll probably be low-power tasks |
19:27
|
shaqfu |
Hm, does the tracker split the whole task into jobs first, or does it do it on a per-request basis? |
19:28
|
ersi |
one task = one job |
19:28
|
shaqfu |
I mean, task as in "all of FormSpring" |
19:29
|
ersi |
Ah. Well, that's entirely a human task currently, as far as I know |
19:29
|
ersi |
i.e. the awesome person who researches the target, writes the project code and so on (most often, awesomely and much appreciated, done by alard) |
19:30
|
shaqfu |
It could be split between normal and low-power jobs on a per-request basis, but that seems like it may require totally restructuring the tracker |
19:31
|
ersi |
I assume that's mostly a lot of difficult work on estimating what's a "big" and "small" task when breaking up the archival job |
19:31
|
shaqfu |
It's not just an issue of "here's 1000 urls" vs "here's 100"? |
19:32
|
ersi |
it's impossible to know beforehand how many links each link will link to, how many resources there are per page and such |
19:32
|
shaqfu |
Gotcha |
19:32
|
ersi |
unless you, you know, crawl them all beforehand... but we might as well grab all of it as we go in that case |
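The point about job size can be illustrated with a toy example. This is only a sketch: the link graph and page names below are made up, and a real crawl would fetch pages rather than read a dictionary. It shows two seeds that look identical to the tracker (one URL each) producing very different amounts of work, which you only learn by walking the links.

```python
from collections import deque


def crawl_size(seed, links):
    """Count the pages reachable from `seed` with a breadth-first walk."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return len(seen)


# Hypothetical link graph: every page name here is invented for illustration.
LINKS = {
    "user/a": ["user/a/q1", "user/a/q2", "user/a/q3"],
    "user/a/q1": ["img/1", "img/2"],
    "user/b": [],
}

# Both seeds are "one URL" up front, but the crawl sizes differ.
sizes = {seed: crawl_size(seed, LINKS) for seed in ("user/a", "user/b")}
print(sizes)
```

Computing `sizes` already required visiting every page, which is the "might as well grab all of it as we go" problem.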
19:33
|
shaqfu |
We could just build the image with loads of swap ;) |
19:33
|
ersi |
Indeed, and/or make it default to only do one/two tasks at a time |
19:33
|
ersi |
and people who host the Pis could turn it up if it works |
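A conservative default like the one suggested here could be sketched as follows. Everything in it is an assumption for illustration: the function name, the 200 MB per-task estimate, and the cap of two are not taken from the tracker or the client image.

```python
def default_concurrency(total_ram_mb, per_task_mb=200, cap=2, override=None):
    """Pick how many tasks to run at once on a low-power host.

    per_task_mb and cap are hypothetical guesses, not measured values.
    A host owner can pass `override` to turn the number up (or down).
    """
    if override is not None:
        return max(1, override)
    fits = total_ram_mb // per_task_mb  # how many tasks the RAM estimate allows
    return max(1, min(cap, fits))       # always at least one, never above cap


print(default_concurrency(512))              # a 512 MB Pi
print(default_concurrency(256))              # a smaller board
print(default_concurrency(512, override=4))  # host owner turns it up
```

The default stays low even on hosts with plenty of RAM, so raising it is always a deliberate choice by whoever runs the machine.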
19:34
|
ersi |
I'm up for giving it a try without modifying any of the current infra, I've got a Pi with 512MB of RAM lying about |
19:34
|
shaqfu |
Same, but mine's running a fairly large job right now |
19:35
|
shaqfu |
(only in #archiveteam is a 1.56M crawl "fairly large"...) |
19:35
|
shaqfu |
Would it be tricky to port to ARM? |
19:37
|
ersi |
only thing we need to do is compile wget(-lua), as far as I know |
19:37
|
shaqfu |
There are .debs for 1.13, but not 1.14 |
19:37
|
shaqfu |
That's something on my to-do list |
19:37
|
ersi |
yeah, but we need wget-lua - which is something that alard has cooked up :) |
19:37
|
ersi |
it's based on wget-1.14 |
19:38
|
shaqfu |
Getting to 1.14 should be first in line, then, if only as a public good :) |
19:38
|
ersi |
sure it isn't available in experimental? |
19:41
|
shaqfu |
Dunno if there's experimental on Raspbian |
19:42
|
ersi |
ah |
19:48
|
shaqfu |
Seems like a reasonable first step |
22:23
|
shaqfu |
wget 1.14 builds out of the box on Raspbian |
22:25
|
shaqfu |
Suppose I'll keep trying to build wget-lua until it stops yelling about dependencies :) |
22:57
|
shaqfu |
Hm, error building wget-lua: couldn't find css.c |
22:58
|
Cameron_D |
I remember getting that error a few times, can't remember what needed to be done to fix it though :( |
22:59
|
shaqfu |
Everything else seems to be going along very well |
23:00
|
shaqfu |
Cameron_D: Was it a missing library? |
23:01
|
Cameron_D |
I think so, it was a strange library, found the solution somewhere deep in Google |
23:04
|
shaqfu |
Trying it with flex |
23:07
|
shaqfu |
Hm, nope |