Time |
Nickname |
Message |
12:36
🔗
|
emijrp |
we are snails |
12:37
🔗
|
emijrp |
man, wikis are being destroyed everywhere |
12:37
🔗
|
emijrp |
go go go |
12:37
🔗
|
emijrp |
who has good bandwidth to download a fuckton of wikis? |
13:03
🔗
|
alard |
Need help with the tracker, or are you also looking for tracker hosting? |
13:03
🔗
|
emijrp |
mm |
13:03
🔗
|
emijrp |
i think we can use a simple method |
13:04
🔗
|
emijrp |
a script that reads a list file and launches the downloader |
13:04
🔗
|
emijrp |
tracker is a bit advanced |
13:05
🔗
|
alard |
Yes, simple is good. Although a tracker may be useful if you're often updating the lists. |
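The "simple method" described above (read a list file, launch the downloader for each entry) might look roughly like the sketch below. It is only an illustration and is not the actual `launcher.py`: the function names `read_wiki_list` and `launch_all` are hypothetical, and the downloader command is passed in as a parameter because the log does not say how the downloader is invoked. Failures are collected in a list rather than stopping the run.

```python
import subprocess


def read_wiki_list(path):
    """Return the URLs in a list file, skipping blank lines and '#' comments."""
    urls = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            entry = line.strip()
            if entry and not entry.startswith("#"):
                urls.append(entry)
    return urls


def launch_all(urls, command):
    """Run `command + [url]` for every URL and return a list of failures.

    `command` is a list such as ["python", "downloader.py"]; that name is a
    placeholder (an assumption), not the real downloader invocation.
    """
    failures = []
    for url in urls:
        try:
            result = subprocess.run(command + [url])
        except OSError as error:
            # The command itself could not be started (e.g. file not found).
            failures.append((url, str(error)))
            continue
        if result.returncode != 0:
            failures.append((url, "exit code %d" % result.returncode))
    return failures
```

A call such as `launch_all(read_wiki_list("list000"), ["python", "downloader.py"])` would then return only the entries that need another look, assuming the downloader signals failure through a non-zero exit code.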
13:20
🔗
|
emijrp |
going by a random sample from a list of 20,000 wikis, about 50% are dead |
13:20
🔗
|
emijrp |
list is from 2009 |
13:21
🔗
|
emijrp |
I'm going to discard those and split the resulting list into batches of 100 wikis |
13:22
🔗
|
emijrp |
and profit |
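The steps mentioned above (estimate the dead fraction from a random sample, discard dead wikis, split the rest into batches of 100) can be sketched as follows. This is a hedged illustration, not the script that generated the published lists: the helper names are hypothetical, and the liveness test `is_alive` (a plain HTTP request that counts any successful response as alive) is an assumption about what "dead" means here.

```python
import http.client
import random
import urllib.request


def is_alive(url, timeout=10):
    """Assumed liveness test: True if the URL answers without an HTTP error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and HTTPError are OSError subclasses; ValueError covers
        # malformed URLs.
        return False


def estimate_dead_fraction(urls, sample_size, check=is_alive, seed=None):
    """Estimate the share of dead wikis from a random sample of the list."""
    rng = random.Random(seed)
    sample = rng.sample(urls, min(sample_size, len(urls)))
    if not sample:
        return 0.0
    dead = sum(1 for url in sample if not check(url))
    return dead / len(sample)


def split_into_batches(urls, batch_size=100):
    """Split a list into consecutive batches of at most `batch_size` items."""
    return [urls[i:i + batch_size] for i in range(0, len(urls), batch_size)]


def build_batches(urls, check=is_alive, batch_size=100):
    """Drop the entries that fail `check`, then batch the survivors."""
    alive = [url for url in urls if check(url)]
    return split_into_batches(alive, batch_size)
```

Passing the check as a parameter keeps the batching logic testable without network access; a stricter check (for example, one that looks for a working wiki API) could be swapped in.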
14:32
🔗
|
emijrp |
generating lists... |
14:32
🔗
|
emijrp |
http://code.google.com/p/wikiteam/wiki/TaskForce |
15:16
🔗
|
emijrp |
lists generated http://code.google.com/p/wikiteam/source/browse/trunk#trunk%2Fbatchdownload%2Flists |
19:38
🔗
|
Nemo_bis |
alard, we could use some help with coding |
19:38
🔗
|
alard |
What kind of help? |
19:39
🔗
|
Nemo_bis |
alard, https://code.google.com/p/wikiteam/issues/list?can=2&q=&sort=priority&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary |
19:40
🔗
|
Nemo_bis |
emijrp created http://wikiteam.googlecode.com/svn/trunk/batchdownload/launcher.py |
19:40
🔗
|
Nemo_bis |
but I suspect it doesn't have any sort of error handling and so on |
19:44
🔗
|
alard |
Ah, I see. I mostly responded to emijrp's question about trackers, though. I'm not that familiar with wiki software. |
19:55
🔗
|
Nemo_bis |
Yes, some sort of tracker will be needed later... |
20:44
🔗
|
emijrp |
man, the first wiki in the list000 has a lot of pages |
20:45
🔗
|
emijrp |
i hope others are small |
20:45
🔗
|
emijrp |
and it is hosted on a bare IP address |
20:45
🔗
|
emijrp |
the first list contains a lot of IP domains |
21:09
🔗
|
Nemo_bis |
emijrp, the problem is that as usual there are lots of errors |
21:09
🔗
|
Nemo_bis |
and it's hard to keep track of them, the script just goes on and you have to scroll back |
21:09
🔗
|
emijrp |
save the output |
21:10
🔗
|
Nemo_bis |
? |
21:12
🔗
|
emijrp |
you can redirect the console text to a file |
21:12
🔗
|
emijrp |
2> a.txt |
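Note that in a POSIX shell `2> a.txt` redirects only standard error; messages printed to standard output still go to the console. If the goal is to keep everything for later review, one option is to capture both streams from the launching script itself. The sketch below assumes the downloader is started as a subprocess; `run_with_log` is a hypothetical helper, not part of the existing tooling.

```python
import subprocess


def run_with_log(command, log_path):
    """Run `command`, appending its stdout and stderr to `log_path`.

    Returns the exit code so the caller can tell which runs failed
    without scrolling back through console output.
    """
    with open(log_path, "ab") as log:
        result = subprocess.run(
            command,
            stdout=log,
            stderr=subprocess.STDOUT,  # merge stderr into the same log file
        )
    return result.returncode
```

The shell equivalent of capturing both streams is `> a.txt 2>&1`.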
21:15
🔗
|
emijrp |
what errors do you get? |
21:15
🔗
|
emijrp |
all urls where alive |
21:15
🔗
|
emijrp |
were* |
22:12
🔗
|
underscor |
Should we submit our logs somewhere? |
22:12
🔗
|
underscor |
I've had a few different errors |
22:12
🔗
|
underscor |
Also, alard dcmorton ersi ops spread por favor |
23:00
🔗
|
Nemo_bis |
underscor, what sort of errors? |
23:00
🔗
|
Nemo_bis |
They're just the usual errors for me, already in the issue tracker. |