[04:40] Nemo_bis: I've gotten a few different ones
[04:40] Error in api.php, please, provide a correct path to api.php
[04:41] Error in api.php, please, provide a correct path to api.php
[04:41] er
[04:41] DO NOT USE THIS SCRIPT TO DOWNLOAD WIKIMEDIA PROJECTS!
[04:41] I guess those are the only two I've gotten
[04:42] oh
[04:42] the one that generated that capital-letters thing
[04:42] http://ca.wikinews.org/w/api.php
[04:42] makes sense :P
[07:45] underscor, the lists contain lots of non-functioning wikis, that's the point
[07:46] some have nasty errors such as https://code.google.com/p/wikiteam/issues/detail?id=48
[07:47] underscor, the errors you should really pay attention to are ones like https://code.google.com/p/wikiteam/issues/detail?id=47 and https://code.google.com/p/wikiteam/issues/detail?id=46
[07:48] btw the script is not doing anything in the whole grep and 7z part here
[08:16] dumpgenerator seems to download duplicate images
[08:16] / not notice if images are duplicates
[08:17] in the -images.txt file I got some lines duplicated almost 300 times
[08:17] for wikibeyondunrealcom
[08:18] can I uniq that file and --resume?
[08:19] http://de.publicdomainproject.org/api.php is giving me "Error in api.php, please, provide a correct path to api.php"
[10:17] sorry about the last bugs
[10:17] I have fixed them in the last hours
[10:17] I have explained in the mailing list
[10:17] update your launcher.py
[10:17] and relaunch...
[10:18] or delete all your downloaded dumps and restart, but that is not needed
[10:18] only if you are paranoid
[10:21] this is the way bugs are discovered, TESTING AND MAKING STUFF
[10:29] cheers!
[10:33] Schbirid: if you are in the task force, add yourself at http://code.google.com/p/wikiteam/wiki/TaskForce
[10:34] nah, just randomly using it to grab wikis I like.
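The "can I uniq that file and --resume?" question above suggests a simple workaround for the duplicated lines in `-images.txt`. A minimal sketch (the helper name is hypothetical, and it only assumes one image record per line; it does not rely on the exact column layout dumpgenerator uses):

```python
# Hypothetical helper: drop duplicate lines from a dumpgenerator
# -images.txt file while preserving the original order, so that a
# later --resume still finds the last image where it expects it.

def dedupe_lines(in_path, out_path):
    seen = set()
    with open(in_path, "r", encoding="utf-8") as fin, \
         open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            if line not in seen:      # keep only the first occurrence
                seen.add(line)
                fout.write(line)
```

Order matters here: a plain `sort | uniq` would reorder the file, while this keeps the first occurrence of each line in place, which is what a resume that relies on the file's last entry would need.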
[10:34] I use it with --images, so not sure if you guys could need them
[10:34] http://code.google.com/p/wikiteam/wiki/TaskForce
[10:35] we use --images always
[10:35] oh wicked
[19:12] sigh, so hard to dig through launcher.py's logs
[19:14] emijrp, does the new launcher.py also resume incomplete dumps?
[19:14] yes
[19:14] like, if not all images have been downloaded or the XML is not complete
[19:14] oh, nice
[19:15] but if the 7z was generated from an incomplete dump, remove it
[19:15] emijrp, how does it do so, does it look for Special:Version?
[19:15] no risk of that because 7z wasn't run :)
[19:16] it checks that the .xml ends in </mediawiki>, and that the last image file is the last image in the -images.txt file
[19:17] oh ok
[19:21] just give it a try, and tell me
[19:26] WARNINGS for files:
[19:26] cook_2bionyuedu_cgsb-history.xml : No more files
[19:26] cook_2bionyuedu_cgsb-titles.txt : No more files
[19:26] emijrp, what's this?
[19:26] ----------------
[19:26] cook_2bionyuedu_cgsb-images.txt : No more files
[19:26] WARNING: Cannot find 3 files
[19:27] that dump seems complete
[19:31] emijrp, the -history, -titles and -images files are not added to the archive although they're there
[19:31] the same for all archives created so far (3, all cook* :p)
[19:32] emijrp, you forgot the timestamp in the filename
[19:33] cook_2bionyuedu_cgsb-20120408-history.xml etc.
[19:36] have you downloaded the latest version of launcher.py?
[19:36] r516 (6 hours ago)
[19:40] yes, it is my fault
[19:41] looks like another bug
[19:42] code updated
[19:42] i hope it WORKS now
[19:42] ok thanks
[19:42] heh
[19:42] remove all .7z
[19:42] sure
[19:43] I love debugging, but fixing bugs is less fun :p
[19:43] this launcher is annoying me
[19:44] by the way, are your wikis big?
[19:44] mine are huge
[19:45] i have the worst luck ever
[19:49] I have at least three with 100k pages I think
[19:50] People love to write in the cloud.
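The completeness check emijrp describes at 19:16 has two parts: the XML must end with the closing `</mediawiki>` tag, and the last image listed in `-images.txt` must actually exist on disk. A minimal sketch of that idea (function names are hypothetical; the real launcher.py logic may differ, and this assumes `-images.txt` puts the filename in the first tab-separated column):

```python
import os

def xml_is_complete(xml_path, tail_bytes=1024):
    # A finished MediaWiki export ends with the closing </mediawiki> tag,
    # so only the tail of the (possibly huge) file needs to be read.
    with open(xml_path, "rb") as f:
        f.seek(max(0, os.path.getsize(xml_path) - tail_bytes))
        return b"</mediawiki>" in f.read()

def images_are_complete(images_txt_path, images_dir):
    # The image download is complete only if the last image listed in
    # -images.txt has been saved to disk.
    last = None
    with open(images_txt_path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                last = line.split("\t")[0].strip()
    if last is None:
        return True  # nothing to download
    return os.path.exists(os.path.join(images_dir, last))
```

Reading only the tail of the XML keeps the check cheap even for multi-gigabyte dumps, which matters when the launcher scans many dumps on restart.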
[19:50] errors.log: WARNING: No more files <-- I guess this is actually good news
[19:50] yes
[19:50] or to import pages from Wikipedia
[19:51] $ wc -l cn18daonet-20120408-titles.txt
[19:51] 350555 cn18daonet-20120408-titles.txt
[19:51] this name sounds familiar
[19:51] not for me
[19:54] I think I previously failed to download this wiki
[19:55] oh, emijrp, if you're annoyed by big wikis, fix https://code.google.com/p/wikiteam/issues/detail?id=44 so that we can download them faster
[19:55] :p
[19:57] in particular #22 and perhaps https://code.google.com/p/wikiteam/issues/detail?id=18, which is probably best fixed via the API too
[20:01] yes
[20:01] but I don't want to fix one of those bugs and break everything
[20:03] I would like other people to make changes too
[20:03] perhaps they can start by documenting the code
[20:03] and when they understand most of it, make patches
[20:04] yes but I don't know who to ask
[20:04] did you mean that *you* could document the code? :)
[20:04] i can code the hard sections
[20:04] yep
[20:05] sorry
[20:05] i can document the hard sections
[20:05] yeah, got it
[20:05] the rest is for you all
[20:05] well, learning Python is not one of my first priorities
[20:06] and then, when 1 or 2 people have studied the code while writing documentation, they can start to make patches
[20:06] i mean all the members, not just you
[20:15] we should probably ask the PWB devs first, but I know none
[20:20] pwb?
[21:50] pywikipediabot...
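The "best fixed via API" remark above refers to using api.php instead of scraping special pages, which is much faster on big wikis. As one illustration (not the actual fix for the linked issues), page titles can be fetched in batches through the standard MediaWiki `list=allpages` API; the URL below is an example, and pagination follows the API's `continue` mechanism:

```python
# Sketch: fetch all page titles from a MediaWiki wiki via api.php
# (action=query&list=allpages), batching aplimit titles per request
# instead of scraping Special:Allpages page by page.
import json
import urllib.parse
import urllib.request

def all_titles(api_url, limit=500):
    params = {"action": "query", "list": "allpages",
              "aplimit": str(limit), "format": "json"}
    while True:
        url = api_url + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        for page in data["query"]["allpages"]:
            yield page["title"]
        cont = data.get("continue")
        if not cont:
            break
        params.update(cont)  # carry the continuation token forward
```

Note that very old MediaWiki releases (current in 2012) used a slightly different `query-continue` continuation format, so a robust tool would need to handle both.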