Time |
Nickname |
Message |
01:15
🔗
|
underscor |
hmmm |
01:16
🔗
|
underscor |
I have a wiki looping infinitely on "XML for Main_Page is incorrect" |
01:16
🔗
|
underscor |
Solution? |
06:50
🔗
|
Nemo_bis |
underscor: ctrl-c |
06:51
🔗
|
underscor |
that's what I've been doing |
06:51
🔗
|
underscor |
but then how do I pipe the output to create a log? |
06:52
🔗
|
Nemo_bis |
underscor: a log of what? |
06:53
🔗
|
underscor |
the log of output |
06:53
🔗
|
underscor |
like http://p.defau.lt/?3qpqOZdAwinhcBFqCAg4vg |
06:53
🔗
|
Nemo_bis |
underscor: interrupting that dump doesn't kill the launcher.py process |
06:54
🔗
|
underscor |
oh, I thought it would kill the > bit |
06:54
🔗
|
underscor |
ok |
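For the logging question above, a minimal sketch of capturing a command's output in a log file from Python, as an alternative to shell `>` redirection. The helper name `run_with_log` and the log path are hypothetical and not part of launcher.py; the sketch only assumes the command writes to stdout and stderr.

```python
import subprocess
import sys


def run_with_log(cmd, log_path):
    """Run cmd, appending its stdout and stderr to log_path.

    Returns the command's exit code. The log is opened in append mode,
    so restarting after an interruption keeps the earlier output.
    """
    with open(log_path, "ab") as log:
        return subprocess.call(cmd, stdout=log, stderr=subprocess.STDOUT)


if __name__ == "__main__":
    # Hypothetical usage: log the output of a trivial child process.
    code = run_with_log([sys.executable, "-c", "print('dump started')"], "dump.log")
    print("exit code:", code)
```

Because the file is opened in append mode, output written before an interruption stays in the log when the command is run again.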
18:04
🔗
|
emijrp |
Nemo_bis: do you remember which script we used to discard dead wikis from Andrew Pavlo's list of 22,000 wikis? |
18:09
🔗
|
emijrp |
Nemo_bis: checkalive.py ok |
18:12
🔗
|
Nemo_bis |
emijrp: you did it all :) |
18:12
🔗
|
Nemo_bis |
emijrp: do you have a new list to run it on? |
18:15
🔗
|
emijrp |
no |
18:37
🔗
|
emijrp |
this failed to download the images, they are corrupt http://archive.org/details/wiki-startrekfreedomcom_wiki |
18:51
🔗
|
Nemo_bis |
emijrp: unlike in the Citizendium dump, the list was created |
18:52
🔗
|
Nemo_bis |
oic, images were downloaded but are bogus |
18:54
🔗
|
Nemo_bis |
O_o The requested method POST is not allowed for the URL /wiki/images/9/9d/1238704362854102575papapishu_albatross_2_svg_med.png. |
18:54
🔗
|
Nemo_bis |
mebbe the downloader should check HTTP headers |
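A rough sketch of what checking HTTP headers before saving a file could look like. This is not the downloader's actual code: the function names are hypothetical, and treating a `text/html` response as an error page is an assumption that would misfire for files that really are HTML.

```python
import urllib.error
import urllib.request


def looks_like_valid_file(status, content_type):
    """Return True if the response plausibly carries the requested file."""
    if status != 200:
        return False
    # Assumption: an HTML body for an image URL is an error page.
    main_type = (content_type or "").split(";")[0].strip().lower()
    return main_type != "text/html"


def fetch_checked(url, timeout=30):
    """GET url; return the body as bytes, or None if the headers look wrong."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            ctype = resp.headers.get("Content-Type")
            if not looks_like_valid_file(resp.status, ctype):
                return None
            return resp.read()
    except urllib.error.HTTPError:
        # urlopen raises HTTPError for 4xx/5xx responses such as 405.
        return None
```

`fetch_checked` issues a plain GET, which also sidesteps the "method POST is not allowed" response quoted above.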
19:01
🔗
|
Nemo_bis |
emijrp: I'm downloading those images for real now |
19:02
🔗
|
emijrp |
there is a bug downloading the .desc files |
19:02
🔗
|
emijrp |
im fixing now |
19:03
🔗
|
Nemo_bis |
ah, also .desc? |
19:03
🔗
|
Nemo_bis |
I'm downloading only files |
19:12
🔗
|
emijrp |
how only files? |
19:13
🔗
|
emijrp |
.desc are downloaded with every file |
19:26
🔗
|
Nemo_bis |
emijrp: too lazy to rerun the script right now, just wget'ing everything |
19:26
🔗
|
Nemo_bis |
actually, already did |
19:31
🔗
|
emijrp |
surely, it is not the only dump with corrupt files |
19:32
🔗
|
emijrp |
i checked some more and they are ok, but corrupt image downloads are an old issue |
19:38
🔗
|
Nemo_bis |
emijrp: do they all give error 405? |
19:38
🔗
|
Nemo_bis |
I could just grep for it |
19:38
🔗
|
emijrp |
i dont know |
19:58
🔗
|
Nemo_bis |
emijrp: there's lots of them, I'll send you a list |
19:59
🔗
|
emijrp |
lots of what? dumps with corrupt images? |
19:59
🔗
|
Nemo_bis |
with 405 errors on images |
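The "grep for it" idea could be sketched in Python roughly as follows. It assumes the bogus files contain the error text quoted earlier in the log ("The requested method POST is not allowed"); the function name and the directory layout are hypothetical.

```python
import os

# Assumption: saved error pages contain this text near the start of the file.
MARKER = b"The requested method POST is not allowed"


def find_bogus_files(root, marker=MARKER, head_bytes=4096):
    """Return a sorted list of files under root whose first bytes contain marker."""
    bogus = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                # Error pages are small, so reading the head is enough.
                if marker in fh.read(head_bytes):
                    bogus.append(path)
    return sorted(bogus)


if __name__ == "__main__":
    # Hypothetical usage: list suspect files in a dump's images directory.
    for path in find_bogus_files("images"):
        print(path)
```

Only the first few kilobytes of each file are read, so a scan over a large images directory stays cheap.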