Time |
Nickname |
Message |
00:12
🔗
|
nico_32 |
02:12:28 [archiveteam@wopr:~] $ python ./wikiteam/launcher.py wikidystifycom_for_jaa |
00:13
🔗
|
nico_32 |
ERROR: The wiki returned status code HTTP 406 |
00:13
🔗
|
nico_32 |
:( |
00:15
🔗
|
|
Wingy has quit IRC (Read error: Operation timed out) |
00:17
🔗
|
JAA |
Oof. I guess that means the API is disabled or something like that? |
00:19
🔗
|
|
Wingy has joined #wikiteam |
00:20
🔗
|
|
systwi_ has joined #wikiteam |
00:23
🔗
|
|
systwi has quit IRC (Read error: Operation timed out) |
00:24
🔗
|
nico_32 |
<head><title>Not Acceptable!</title></head><body><h1>Not Acceptable!</h1><p>An appropriate representation of the requested resource could not be found on this server. This error was generated by Mod_Security.</p></body></html> |
00:24
🔗
|
nico_32 |
dumpgenerator.py triggers mod_security |
00:24
🔗
|
nico_32 |
i don't know why |
00:24
🔗
|
nico_32 |
if i try |
00:24
🔗
|
nico_32 |
curl -v --data "title=Special:Version" -X POST https://wiki.dystify.com/index.php |
00:24
🔗
|
nico_32 |
it works |
00:26
🔗
|
JAA |
Hmm, something related to the user agent perhaps? |
00:26
🔗
|
JAA |
I see that dumpgenerator.py uses a Firefox UA. Maybe that's blocked on the API. |
00:29
🔗
|
nico_32 |
yes it works with 'curl/7.66.0' |
00:29
🔗
|
nico_32 |
don't ask me why |
00:31
🔗
|
JAA |
Computers, how do they even work? |
00:31
🔗
|
nico_32 |
Mozilla/5.0 (Windows NT 10.0; rv:78.0) Gecko/20100101 Firefox/78.0 |
00:31
🔗
|
nico_32 |
works for UA |
00:32
🔗
|
nico_32 |
Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0 |
00:32
🔗
|
nico_32 |
doesn't |
00:32
🔗
|
JAA |
Huh |
00:32
🔗
|
nico_32 |
they have something against Firefox on X11 |
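The user-agent test described above can be reproduced with a small probe. This is only a sketch, not what nico_32 actually ran: `probe_user_agents` and its `opener` parameter are hypothetical names, and the POST body simply mirrors the curl command quoted earlier in the log.

```python
import urllib.error
import urllib.request


def probe_user_agents(url, user_agents, data=b"title=Special:Version",
                      opener=urllib.request.urlopen):
    """Send the same POST once per User-Agent and record the HTTP status.

    `opener` is injectable (an assumption made for illustration) so the
    probe can be exercised without touching the network.
    """
    results = {}
    for user_agent in user_agents:
        request = urllib.request.Request(
            url, data=data, headers={"User-Agent": user_agent}
        )
        try:
            with opener(request, timeout=10) as response:
                results[user_agent] = response.status
        except urllib.error.HTTPError as err:
            # A rejected request (e.g. HTTP 406) is a result, not a failure.
            results[user_agent] = err.code
    return results
```

Calling it with `https://wiki.dystify.com/index.php` and the user-agent strings quoted in the log would show which ones the server rejects, assuming the server still behaves as it did at the time.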
00:34
🔗
|
nico_32 |
will be available on http://109.190.103.130/wiki_dump/ |
00:34
🔗
|
nico_32 |
since i will not have the right to upload to the current item on IA |
00:35
🔗
|
JAA |
Yeah, no idea who wiki_archiver is either. |
00:35
🔗
|
JAA |
Thanks! |
00:36
🔗
|
* |
nico_32 requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='wiki.dystify.com', port=443): Read timed out. (read timeout=10) |
00:36
🔗
|
nico_32 |
No </mediawiki> tag found: dump failed, needs fixing; resume didn't work. Exiting. |
00:36
🔗
|
nico_32 |
rha |
00:37
🔗
|
nico_32 |
need some sleep i believe |
00:44
🔗
|
nico_32 |
restarted with a delay set to 60 seconds and 10 retries |
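The delay-and-retry approach mentioned here can be sketched in a few lines. This is a generic illustration and not dumpgenerator.py's own retry logic: `call_with_retries` is a hypothetical helper, and the built-in `TimeoutError` stands in for the `requests.exceptions.ReadTimeout` seen above.

```python
import time


def call_with_retries(action, retries=10, delay=60, sleep=time.sleep,
                      retry_on=(TimeoutError,)):
    """Call `action` up to `retries` times, sleeping `delay` seconds between tries.

    Defaults mirror the values from the log (10 retries, 60 s). `sleep` is
    injectable so the helper can be exercised without actually waiting.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")
    for attempt in range(1, retries + 1):
        try:
            return action()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            sleep(delay)
```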
00:45
🔗
|
nico_32 |
will see tomorrow what happens |
00:45
🔗
|
nico_32 |
but i believe this will be a dump to monitor |
00:51
🔗
|
nico_32 |
https://ia802804.us.archive.org/view_archive.php?archive=/15/items/wiki-wikidystifycom/wikidystifycom-20200209-wikidump.7z&file=images%2FZelda%2016.png |
00:51
🔗
|
nico_32 |
the current IA dump is missing content |
00:51
🔗
|
nico_32 |
probably not all images were downloaded |
00:53
🔗
|
nico_32 |
anyway |
00:54
🔗
|
* |
nico_32 is currently using Mozilla/5.0 (Windows NT 3.51) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36 as useragent |
06:16
🔗
|
Nemo_bis |
Thanks, let us know how it goes. :) |
06:17
🔗
|
Nemo_bis |
HTTP 406 is sometimes caused by webservers configured to reject certain requests over GET or POST |
06:17
🔗
|
Nemo_bis |
We try to guess which one to use but only in a very simplistic way |
06:44
🔗
|
|
atphoenix has joined #wikiteam |
06:57
🔗
|
|
atphoenix has quit IRC (Read error: Connection reset by peer) |
07:00
🔗
|
|
atphoenix has joined #wikiteam |
07:07
🔗
|
|
systwi_ is now known as systwi |
07:42
🔗
|
|
atphoenix has quit IRC (Read error: Connection reset by peer) |
08:32
🔗
|
|
atphoenix has joined #wikiteam |
08:49
🔗
|
|
atphoenix has quit IRC (Read error: Operation timed out) |
08:59
🔗
|
|
atphoenix has joined #wikiteam |
09:15
🔗
|
|
robogoat has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
kiska1825 has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
benjinss has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
TC01 has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
kiska has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
jodizzle has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
JAA has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
ripdog has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
Igloo has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
revi has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
myself has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
Ryz has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
legoktm has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
Datechnom has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
atphoenix has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
systwi has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
Wingy has quit IRC (se.hub ircd.choopa.net) |
09:15
🔗
|
|
paul2520 has quit IRC (se.hub ircd.choopa.net) |
09:19
🔗
|
|
atphoenix has joined #wikiteam |
09:19
🔗
|
|
systwi has joined #wikiteam |
09:19
🔗
|
|
Wingy has joined #wikiteam |
09:19
🔗
|
|
Ryz has joined #wikiteam |
09:19
🔗
|
|
kiska1825 has joined #wikiteam |
09:19
🔗
|
|
benjinss has joined #wikiteam |
09:19
🔗
|
|
legoktm has joined #wikiteam |
09:19
🔗
|
|
Igloo has joined #wikiteam |
09:19
🔗
|
|
revi has joined #wikiteam |
09:19
🔗
|
|
myself has joined #wikiteam |
09:19
🔗
|
|
TC01 has joined #wikiteam |
09:19
🔗
|
|
paul2520 has joined #wikiteam |
09:19
🔗
|
|
robogoat has joined #wikiteam |
09:19
🔗
|
|
Datechnom has joined #wikiteam |
09:19
🔗
|
|
kiska has joined #wikiteam |
09:19
🔗
|
|
JAA has joined #wikiteam |
09:19
🔗
|
|
jodizzle has joined #wikiteam |
09:19
🔗
|
|
ripdog has joined #wikiteam |
09:19
🔗
|
|
ny.us.hub sets mode: +o JAA |
09:19
🔗
|
|
AlsoJAA sets mode: +o JAA |
09:19
🔗
|
|
JAA sets mode: +o AlsoJAA |
09:20
🔗
|
|
phuzion has quit IRC (se.hub irc.efnet.nl) |
09:20
🔗
|
|
pikami has quit IRC (se.hub irc.efnet.nl) |
09:20
🔗
|
|
SPF|Cloud has quit IRC (se.hub irc.efnet.nl) |
09:20
🔗
|
|
BnAboyZ has quit IRC (se.hub irc.efnet.nl) |
09:20
🔗
|
|
nico_32 has quit IRC (se.hub irc.efnet.nl) |
09:20
🔗
|
|
chfoo has quit IRC (se.hub irc.efnet.nl) |
09:42
🔗
|
|
nico_32 has joined #wikiteam |
09:42
🔗
|
|
phuzion has joined #wikiteam |
09:42
🔗
|
|
pikami has joined #wikiteam |
09:42
🔗
|
|
SPF|Cloud has joined #wikiteam |
09:42
🔗
|
|
BnAboyZ has joined #wikiteam |
09:42
🔗
|
|
chfoo has joined #wikiteam |
10:08
🔗
|
|
Ryz has quit IRC (Quit: Ping timeout (120 seconds)) |
10:08
🔗
|
|
kiska1825 has quit IRC (Ping timeout (120 seconds)) |
10:35
🔗
|
|
atphoenix has quit IRC (Read error: Operation timed out) |
12:58
🔗
|
|
randomd has joined #wikiteam |
13:03
🔗
|
|
randomdes has quit IRC (se.hub irc.underworld.no) |
13:03
🔗
|
|
Nemo_bis has quit IRC (se.hub irc.underworld.no) |
15:39
🔗
|
|
Nemo_bis has joined #wikiteam |
15:40
🔗
|
|
Ryz has joined #wikiteam |
16:49
🔗
|
|
Datechnom has quit IRC (Quit: Ping timeout (120 seconds)) |
16:50
🔗
|
|
Datechnom has joined #wikiteam |
19:20
🔗
|
|
Nemo_bis has quit IRC (se.hub irc.underworld.no) |
19:30
🔗
|
|
robogoat has quit IRC (hub.efnet.us irc.Prison.NET) |
19:37
🔗
|
|
atphoenix has joined #wikiteam |
19:51
🔗
|
|
robogoat has joined #wikiteam |
19:54
🔗
|
|
Nemo_bis has joined #wikiteam |
21:05
🔗
|
nico_32 |
still working on https://wiki.dystify.com, grabbing images |
21:55
🔗
|
nico_32 |
# HACK HACK for wiki.dystify.com |
21:55
🔗
|
nico_32 |
url = url.replace('http:','https:') |
21:55
🔗
|
nico_32 |
broken HTTP to HTTPS redirect |
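The one-line hack above replaces every occurrence of `http:` in the URL, including any inside a query string. A slightly narrower variant, offered only as a sketch (`force_https` is a hypothetical name, not part of dumpgenerator.py), rewrites the scheme alone:

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Upgrade an http:// URL to https://, leaving the rest of the URL untouched."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

For a wiki whose URLs never embed another `http:` link, the original `url.replace('http:','https:')` gives the same result.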
22:08
🔗
|
|
AlsoJAA_ has joined #wikiteam |
22:08
🔗
|
|
JAA sets mode: +o AlsoJAA_ |
22:09
🔗
|
nico_32 |
and getXMLFileDesc is losing the final </mediawiki> tag in Special:Export |
22:09
🔗
|
nico_32 |
this one is strange |
22:09
🔗
|
nico_32 |
i hacked the code to match on </page> |
22:11
🔗
|
|
second_ has joined #wikiteam |
22:12
🔗
|
nico_32 |
yield xml.split("</page>")[0] |
22:12
🔗
|
nico_32 |
..... |
22:12
🔗
|
|
AlsoJAA has quit IRC (Ping timeout: 745 seconds) |
22:12
🔗
|
nico_32 |
how did this code work before ?!? |
22:13
🔗
|
|
second has quit IRC (Ping timeout: 745 seconds) |
22:14
🔗
|
|
AlsoJAA_ is now known as AlsoJAA |
22:15
🔗
|
nico_32 |
773 wikidystifycom-20200827-images.txt |
22:15
🔗
|
nico_32 |
120 seconds of wait per image |
22:16
🔗
|
JAA |
At least the count is right per https://wiki.dystify.com/Special:MediaStatistics |
22:16
🔗
|
JAA |
Thanks for doing this! |
22:16
🔗
|
JAA |
This wiki is a real mess. |
22:17
🔗
|
nico_32 |
moved from 120 seconds of delay to 20 seconds |
22:18
🔗
|
nico_32 |
should take 4 hours |
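The four-hour figure follows from the image count and the per-image delay quoted above. A quick check, assuming the delay dominates and download time is negligible (`estimated_hours` is a hypothetical helper):

```python
def estimated_hours(image_count, delay_seconds):
    """Lower bound on run time: one fixed delay per image, download time ignored."""
    return image_count * delay_seconds / 3600


print(f"{estimated_hours(773, 120):.1f} h at 120 s per image")  # 25.8 h
print(f"{estimated_hours(773, 20):.1f} h at 20 s per image")    # 4.3 h
```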
22:24
🔗
|
nico_32 |
https://gist.github.com/nsapa/078e12acf5648fde11efbe8fd707e2ea#file-patch-diff |
23:22
🔗
|
|
paul2520 has quit IRC (Read error: Operation timed out) |
23:30
🔗
|
|
systwi has quit IRC (Read error: Operation timed out) |
23:31
🔗
|
|
Wingy has quit IRC (Read error: Operation timed out) |
23:54
🔗
|
|
systwi has joined #wikiteam |
23:57
🔗
|
|
Wingy has joined #wikiteam |