| Time | Nickname | Message |
|---|---|---|
| 04:47 | SketchCow | So, the warrior is the greatest thing ever |
| 04:58 | underscor | ^ |
| 05:02 | BlueMax | lol |
| 05:02 | BlueMax | surprised we didn't think of this earlier |
| 05:16 | SketchCow | We did! |
| 05:16 | SketchCow | I announced the idea a while ago |
| 05:16 | SketchCow | And pre-ideas existed |
| 05:16 | SketchCow | But now it's just an unstoppable juggernaut |
| 05:16 | SketchCow | The real puzzler now is coming up with tasks other than 'download this website' |
| 05:38 | bsmith094 | keeping up with fanfiction.net scrape, and maybe fictionpress as well |
| 05:48 | underscor | I mean, there's things like the indexer job |
| 05:48 | underscor | but yeah, would be nice to always keep them busy |
| 07:24 | SmileyG | Can it do the small time compression / splitting files stuff? |
| 07:24 | SmileyG | or are they too large to sensibly attempt? |
| 07:27 | SmileyG | creating metadata for anything? (I don't know the structure of the data within the archives..) |
| 13:56 | underscor | SmileyG: Alard had them do that for the tars |
| 13:56 | underscor | but I'm trying to think of more jobs |
| 13:58 | SmileyG | tbh I don't know much of the process... so I can't really help :( |
| 13:58 | SketchCow | We'll just have to keep thinking. |
| 13:58 | SmileyG | preemp backups of the "sites to be watched" ? |
| 13:59 | SketchCow | The problem just isn't processing for determining the contents of items on archive. |
| 17:34 | yipdw | if someone wants to adapt the fanfiction.net scraper I wrote for seesaw, that'd be nice |
| 17:34 | yipdw | they're both in Python, so it's not a language translation |
| 17:34 | yipdw | you'll also need a tracker, which can get a bit expensive to run, especially if you want to do continuous backup |
| 17:35 | yipdw | not sure if the AT shared tracker is up to it |
| 18:28 | ersi | Is the source up on the AT github space? |
| 18:32 | yipdw | yes, ffnet-grab |
| 18:33 | yipdw | I make no claims for it being *good* Python, but I like to think it is at least straightforward |
| 18:33 | yipdw | heh |
| 18:47 | alard | yipdw: What does the tracker need to do that the shared tracker can't do? |
| 19:09 | yipdw | alard: nothing special, it's just a lot of data |
| 19:11 | yipdw | alard: more specifically, frequent updates -- ff.net has 2 million or so users; each user is a work item, and I've noticed that that puts a lot of stress on the tracker when e.g. generating progress graphs |
| 19:11 | yipdw | I did implement a quick-and-dirty trimming method in a private copy of the tracker, but the one I implemented discarded history instead of making past history less granular |
| 19:12 | yipdw | I'm not sure if you or someone else has fixed that yet; still reviewing universal-tracker history |
| 19:32 | alard | Ah, yes, no, that's still the same. |
| 19:35 | alard | Although it's mainly the stats page that becomes slow; the tracker still works. |
| 19:40 | yipdw | ah, ok |
| 19:40 | yipdw | right |
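
Editor's note: yipdw's 19:11 messages contrast two ways of trimming tracker history: discarding old points outright, or keeping them at a coarser granularity. The second idea can be sketched roughly as below. This is a hypothetical illustration only, not code from universal-tracker or from yipdw's private copy; the function name `thin_history`, the `(timestamp, items_done)` point format, and the `recent_window` and `bucket` parameters are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical point format: (unix timestamp in seconds, cumulative items done).
Point = Tuple[int, int]


def thin_history(points: List[Point], now: int,
                 recent_window: int = 3600, bucket: int = 600) -> List[Point]:
    """Make old history less granular instead of discarding it.

    Points newer than `now - recent_window` are kept untouched. Older
    points are grouped into `bucket`-second buckets and only the last
    point of each bucket survives.
    """
    cutoff = now - recent_window
    coarse: Dict[int, Point] = {}
    recent: List[Point] = []
    for ts, value in sorted(points):
        if ts >= cutoff:
            recent.append((ts, value))
        else:
            # A later point in the same bucket overwrites an earlier one.
            coarse[ts // bucket] = (ts, value)
    return [coarse[key] for key in sorted(coarse)] + recent


history = [(0, 1), (100, 2), (500, 3), (700, 4), (1300, 5), (5000, 6), (5100, 7)]
thinned = thin_history(history, now=5200, recent_window=600, bucket=600)
print(thinned)
```

With this approach a progress graph still reaches back to the start of the project, but the number of points it has to draw for old periods is bounded by the bucket size rather than by the update frequency.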