Time |
Nickname |
Message |
04:40
🔗
|
underscor |
Nemo_bis: I've gotten a few different ones |
04:40
🔗
|
underscor |
Error in api.php, please, provide a correct path to api.php |
04:41
🔗
|
underscor |
Error in api.php, please, provide a correct path to api.php |
04:41
🔗
|
underscor |
er |
04:41
🔗
|
underscor |
DO NOT USE THIS SCRIPT TO DOWNLOAD WIKIMEDIA PROJECTS! |
04:41
🔗
|
underscor |
I guess those are the only two I've gotten |
04:42
🔗
|
underscor |
oh |
04:42
🔗
|
underscor |
the one that generated that capital letters thing |
04:42
🔗
|
underscor |
http://ca.wikinews.org/w/api.php |
04:42
🔗
|
underscor |
makes sense :P |
07:45
🔗
|
Nemo_bis |
underscor, the lists contain lots of non functioning wikis, that's the point |
07:46
🔗
|
Nemo_bis |
some have nasty errors such as https://code.google.com/p/wikiteam/issues/detail?id=48 |
07:47
🔗
|
Nemo_bis |
underscor, the errors you should really pay attention to are like https://code.google.com/p/wikiteam/issues/detail?id=47 and https://code.google.com/p/wikiteam/issues/detail?id=46 |
07:48
🔗
|
Nemo_bis |
btw the script is not doing anything in the whole grep and 7z part here |
08:16
🔗
|
Schbirid |
dumpgenerator seems to download duplicate images |
08:16
🔗
|
Schbirid |
/ does not notice if images are duplicates |
08:17
🔗
|
Schbirid |
in the -images.txt file i got some lines duplicated almost 300 times |
08:17
🔗
|
Schbirid |
for wikibeyondunrealcom |
08:18
🔗
|
Schbirid |
can i uniq that file and --resume? |
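A minimal sketch of the "uniq" step asked about above, assuming only that the -images.txt list is a plain text file with one entry per line. It drops exact duplicate lines while keeping the first occurrence in its original position, which a sort-based uniq would not. The function names are hypothetical and not part of dumpgenerator, and whether --resume accepts the cleaned file is not confirmed anywhere in this log.

```python
def dedupe_lines(lines):
    """Return the lines with exact duplicates removed, first occurrence kept."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


def dedupe_file(src_path, dst_path):
    """Write a de-duplicated copy of src_path to dst_path (hypothetical helper).

    The original file is left untouched so it can be restored if needed.
    """
    with open(src_path, encoding="utf-8") as src:
        unique = dedupe_lines(src.readlines())
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(unique)
    return len(unique)
```

Writing to a separate output file rather than editing in place keeps a backup of the list in case the resume step turns out to need the original.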
08:19
🔗
|
Schbirid |
http://de.publicdomainproject.org/api.php is giving me "Error in api.php, please, provide a correct path to api.php" |
10:17
🔗
|
emijrp |
sorry about the last bugs |
10:17
🔗
|
emijrp |
i have fixed them in the last hours |
10:17
🔗
|
emijrp |
i have explained in the mailing list |
10:17
🔗
|
emijrp |
update your launcher.py |
10:17
🔗
|
emijrp |
and relaunch... |
10:18
🔗
|
emijrp |
or delete all your downloaded dumps and restart, but that is not needed |
10:18
🔗
|
emijrp |
only if you are paranoid |
10:21
🔗
|
emijrp |
this is the way bugs are discovered, TESTING AND MAKING STUFF |
10:29
🔗
|
Schbirid |
cheers! |
10:33
🔗
|
emijrp |
Schbirid: if you are in the task force, add yourself at http://code.google.com/p/wikiteam/wiki/TaskForce |
10:34
🔗
|
Schbirid |
nah, just randomly using it to grab wikis i like. i use it with --images so not sure if you guys could need them |
10:34
🔗
|
Schbirid |
http://code.google.com/p/wikiteam/wiki/TaskForce |
10:35
🔗
|
emijrp |
we use --images always |
10:35
🔗
|
Schbirid |
oh wicked |
19:12
🔗
|
Nemo_bis |
sigh, so hard to dig the launcher.py's logs |
19:14
🔗
|
Nemo_bis |
emijrp, does the new launcher.py also resume incomplete dumps? |
19:14
🔗
|
emijrp |
yes |
19:14
🔗
|
Nemo_bis |
like, if not all images have been downloaded or the XML is not complete |
19:14
🔗
|
Nemo_bis |
oh, nice |
19:15
🔗
|
emijrp |
but if the 7z was generate from an incomplete dump, remove it |
19:15
🔗
|
emijrp |
generated* |
19:15
🔗
|
Nemo_bis |
emijrp, how does it do so, looks for special:version? |
19:15
🔗
|
Nemo_bis |
no risk of that because 7z wasn't run :) |
19:16
🔗
|
emijrp |
it checks if the .xml ends in </mediawiki>, and if the last image file is the last image in the -images.txt file |
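The two checks described above could be sketched roughly as follows. This is not the actual launcher.py code. It assumes the image list has one entry per line with the file name as the first tab-separated field, and that downloaded images sit in a single directory. Both are assumptions, as are the function names. A real list may carry extra trailer lines that would need to be skipped.

```python
import os


def xml_looks_complete(xml_path, tail_bytes=1024):
    """Return True if the last non-whitespace text in the file is </mediawiki>."""
    with open(xml_path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        # Read only the tail so huge dumps are not loaded into memory.
        f.seek(max(0, size - tail_bytes))
        tail = f.read()
    return tail.rstrip().endswith(b"</mediawiki>")


def last_listed_image_present(images_list_path, images_dir):
    """Return True if the last image named in the list exists in images_dir.

    Assumes the file name is the first tab-separated field of each line.
    Returns False for an empty list, which a caller may want to treat separately.
    """
    last_name = None
    with open(images_list_path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                last_name = line.split("\t")[0]
    if last_name is None:
        return False
    return os.path.isfile(os.path.join(images_dir, last_name))
```

Both checks are heuristics: they show that the download reached its end, not that every item in between is intact.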
19:17
🔗
|
Nemo_bis |
oh ok |
19:21
🔗
|
emijrp |
just give it a try, and tell me |
19:26
🔗
|
Nemo_bis |
WARNINGS for files: |
19:26
🔗
|
Nemo_bis |
cook_2bionyuedu_cgsb-history.xml : No more files |
19:26
🔗
|
Nemo_bis |
cook_2bionyuedu_cgsb-titles.txt : No more files |
19:26
🔗
|
Nemo_bis |
emijrp, what's this? |
19:26
🔗
|
Nemo_bis |
---------------- |
19:26
🔗
|
Nemo_bis |
cook_2bionyuedu_cgsb-images.txt : No more files |
19:26
🔗
|
Nemo_bis |
WARNING: Cannot find 3 files |
19:27
🔗
|
Nemo_bis |
that dump seems complete |
19:31
🔗
|
Nemo_bis |
emijrp, the -history, -titles and -images files are not added to the archive although they're there |
19:31
🔗
|
Nemo_bis |
the same for all archives created so far (3, all cook* :p) |
19:32
🔗
|
Nemo_bis |
emijrp, you forgot the timestamp in the filename |
19:33
🔗
|
Nemo_bis |
cook_2bionyuedu_cgsb-20120408-history.xml etc. |
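The 7z warnings above came from looking for names without the date part. A small sketch of building the dated names and checking which are missing, assuming the pattern prefix-date-suffix seen in this log holds for all three files. The helper names are hypothetical and not taken from launcher.py.

```python
import os

# Suffixes mentioned in the log: -history, -titles and -images.
SUFFIXES = ("history.xml", "titles.txt", "images.txt")


def expected_dump_files(prefix, date):
    """Build dated file names such as cook_2bionyuedu_cgsb-20120408-history.xml."""
    return [f"{prefix}-{date}-{suffix}" for suffix in SUFFIXES]


def missing_dump_files(directory, prefix, date):
    """Return the expected file names that are not present in directory."""
    return [
        name
        for name in expected_dump_files(prefix, date)
        if not os.path.isfile(os.path.join(directory, name))
    ]
```

Running a check like this before archiving would surface a wrong file name pattern as an explicit list of missing files rather than as a 7z warning.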
19:36
🔗
|
emijrp |
have you downloaded the latest version of launcher.py ? |
19:36
🔗
|
emijrp |
r516 (6 hours ago) |
19:40
🔗
|
emijrp |
yes, it is my fault |
19:41
🔗
|
emijrp |
looks like another bug |
19:42
🔗
|
emijrp |
code updated |
19:42
🔗
|
emijrp |
i hope it WORKS now |
19:42
🔗
|
Nemo_bis |
ok thanks |
19:42
🔗
|
Nemo_bis |
heh |
19:42
🔗
|
emijrp |
remove all .7z |
19:42
🔗
|
Nemo_bis |
sure |
19:43
🔗
|
Nemo_bis |
I love debugging, but fixing bugs is less fun :p |
19:43
🔗
|
emijrp |
this launcher is annoying me |
19:44
🔗
|
emijrp |
by the way, are your wikis big? |
19:44
🔗
|
emijrp |
mine are huge |
19:45
🔗
|
emijrp |
i have the worst luck ever |
19:49
🔗
|
Nemo_bis |
I have at least three with 100k pages I think |
19:50
🔗
|
emijrp |
People love to write in the cloud. |
19:50
🔗
|
Nemo_bis |
errors.log: WARNING: No more files <-- I guess this is actually good news |
19:50
🔗
|
emijrp |
yes |
19:50
🔗
|
Nemo_bis |
or to import pages from Wikipedia |
19:51
🔗
|
Nemo_bis |
$ wc -l cn18daonet-20120408-titles.txt |
19:51
🔗
|
Nemo_bis |
350555 cn18daonet-20120408-titles.txt |
19:51
🔗
|
Nemo_bis |
this name sounds familiar |
19:51
🔗
|
emijrp |
not for me |
19:54
🔗
|
Nemo_bis |
I think I previously failed to download this wiki |
19:55
🔗
|
Nemo_bis |
oh, emijrp, if you're annoyed by big wikis fix https://code.google.com/p/wikiteam/issues/detail?id=44 so that we can download them faster |
19:55
🔗
|
Nemo_bis |
:p |
19:57
🔗
|
Nemo_bis |
in particular #22 and perhaps https://code.google.com/p/wikiteam/issues/detail?id=18 which is probably best fixed via API too |
20:01
🔗
|
emijrp |
yes |
20:01
🔗
|
emijrp |
but i don't want to fix one of those bugs and break everything |
20:03
🔗
|
emijrp |
i would like people to make changes too |
20:03
🔗
|
emijrp |
perhaps, he can start to document the code |
20:03
🔗
|
emijrp |
and when he understands most of it, make patches |
20:04
🔗
|
Nemo_bis |
yes but I don't know who to ask |
20:04
🔗
|
Nemo_bis |
did you mean that *you* could document the code? :) |
20:04
🔗
|
emijrp |
i can code the hard sections |
20:04
🔗
|
Nemo_bis |
yep |
20:05
🔗
|
emijrp |
sorry |
20:05
🔗
|
emijrp |
i can document the hard sections |
20:05
🔗
|
Nemo_bis |
yeah, got it |
20:05
🔗
|
emijrp |
the rest for you all |
20:05
🔗
|
Nemo_bis |
well, learning python is not one of my first priorities |
20:06
🔗
|
emijrp |
and then, when 1 or 2 people have studied the code while making documentation, they can start to make patches |
20:06
🔗
|
emijrp |
i speak about all the members, not just you |
20:15
🔗
|
Nemo_bis |
we should probably ask the PWB devs first, but I know none |
20:20
🔗
|
emijrp |
pwb? |
21:50
🔗
|
Nemo_bis |
pywikipediabot... |