#wikiteam 2012-04-07,Sat

Time Nickname Message
12:36 🔗 emijrp we are snails
12:37 🔗 emijrp man, wikis are being destroyed everywhere
12:37 🔗 emijrp go go go
12:37 🔗 emijrp who has good bandwidth to download a fuckton of wikis?
13:03 🔗 alard Need help with the tracker, or are you also looking for tracker hosting?
13:03 🔗 emijrp mm
13:03 🔗 emijrp i think we can use a simple method
13:04 🔗 emijrp a script that reads a list file and launches the downloader
13:04 🔗 emijrp tracker is a bit advanced
13:05 🔗 alard Yes, simple is good. Although a tracker may be useful if you're often updating the lists.
13:20 🔗 emijrp calculating a random sample from a 20,000 wikis list, about 50% are dead
13:20 🔗 emijrp list is from 2009
13:21 🔗 emijrp i'm going to discard those, and split the resulting list into batches of 100 wikis
13:22 🔗 emijrp and profit
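
The plan above (estimate the dead fraction from a random sample, discard dead wikis, split the rest into batches of 100) could look roughly like the sketch below. This is not the script emijrp used; the function names, the liveness test (any HTTP response below status 400) and the timeout are assumptions made for illustration.

```python
import http.client
import random
import urllib.request


def is_alive(url, timeout=10):
    """Assumed liveness test: the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status < 400
    except (OSError, ValueError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


def estimate_dead_fraction(urls, sample_size, alive=is_alive, seed=None):
    """Check a random sample and return the fraction of dead URLs in it."""
    sample = random.Random(seed).sample(urls, min(sample_size, len(urls)))
    if not sample:
        return 0.0
    dead = sum(1 for url in sample if not alive(url))
    return dead / len(sample)


def split_batches(urls, size=100, alive=is_alive):
    """Drop dead URLs, then cut the survivors into lists of `size` entries."""
    live = [url for url in urls if alive(url)]
    return [live[i:i + size] for i in range(0, len(live), size)]
```

Passing `alive` as a parameter keeps the splitting logic testable without touching the network; for a list of 20,000 URLs the real check would take a long time if run sequentially.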
14:32 🔗 emijrp generating lists...
14:32 🔗 emijrp http://code.google.com/p/wikiteam/wiki/TaskForce
15:16 🔗 emijrp lists generated http://code.google.com/p/wikiteam/source/browse/trunk#trunk%2Fbatchdownload%2Flists
19:38 🔗 Nemo_bis alard, we could use some help with coding
19:38 🔗 alard What kind of help?
19:39 🔗 Nemo_bis alard, https://code.google.com/p/wikiteam/issues/list?can=2&q=&sort=priority&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary
19:40 🔗 Nemo_bis emijrp created http://wikiteam.googlecode.com/svn/trunk/batchdownload/launcher.py
19:40 🔗 Nemo_bis but I suspect it doesn't have any sort of error handling and so on
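
One way a batch launcher could handle errors is sketched below: it reads a list file, runs one download command per wiki, and records failures instead of stopping or silently moving on. This is not the actual launcher.py; `read_list`, `run_batch` and the `build_command` callback are hypothetical names, and the real downloader command line is deliberately left to the caller.

```python
import subprocess


def read_list(path):
    """Return the non-empty lines of a list file, one wiki URL per line."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def run_batch(urls, build_command, timeout=None):
    """Run one command per URL and collect (url, reason) for each failure.

    build_command(url) must return the argument list to execute; it is a
    placeholder for whatever downloader invocation is actually used.
    """
    failures = []
    for url in urls:
        try:
            result = subprocess.run(build_command(url), timeout=timeout)
        except (OSError, subprocess.TimeoutExpired) as error:
            # Command could not start, or ran past the timeout.
            failures.append((url, str(error)))
            continue
        if result.returncode != 0:
            failures.append((url, "exit code %d" % result.returncode))
    return failures
```

The returned list gives a summary at the end of a batch, so failed wikis can be retried or reported without scrolling back through the console.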
19:44 🔗 alard Ah, I see. I mostly responded to emijrp's question about trackers, though. I'm not that familiar with wiki software.
19:55 🔗 Nemo_bis Yes, some sort of tracker will be needed later...
20:44 🔗 emijrp man, the first wiki in the list000 has a lot of pages
20:45 🔗 emijrp i hope others are small
20:45 🔗 emijrp and it is hosted on an IP address
20:45 🔗 emijrp first list contains a lot of IP domains
21:09 🔗 Nemo_bis emijrp, the problem is that as usual there are lots of errors
21:09 🔗 Nemo_bis and it's hard to keep track of them, the script just goes on and you have to scroll back
21:09 🔗 emijrp save the output
21:10 🔗 Nemo_bis ?
21:12 🔗 emijrp you can redirect the console text to a file
21:12 🔗 emijrp 2> a.txt
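
Note that `2> a.txt` in a shell redirects only standard error. If the downloader is started from a Python script, a comparable effect that also captures standard output might look like this sketch; `run_logged` is a hypothetical helper, and appending to the log rather than overwriting it is an assumption.

```python
import subprocess


def run_logged(command, log_path):
    """Run a command, appending its stdout and stderr to log_path.

    Returns the command's exit code.
    """
    with open(log_path, "ab") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

Keeping both streams in one file preserves the order in which progress messages and errors appeared, which makes it easier to see which wiki an error belongs to.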
21:15 🔗 emijrp what errors do you get?
21:15 🔗 emijrp all urls where alive
21:15 🔗 emijrp were*
22:12 🔗 underscor Should we submit our logs somewhere?
22:12 🔗 underscor I've had a few different errors
22:12 🔗 underscor Also, alard dcmorton ersi ops spread por favor
23:00 🔗 Nemo_bis underscor, what sort of errors?
23:00 🔗 Nemo_bis They're just the usual errors for me, already in the issue tracker.
