#wikiteam 2013-10-09,Wed


Time Nickname Message
19:02 kyan I'm trying to use dumpgenerator.py to archive http://www.frathwiki.com/, but I'm getting "Error in api.php, please, provide a correct path to api.php". The command I'm running: python ../dumpgenerator.py --api=http://www.frathwiki.com/api.php --xml --images Any thoughts?
19:26 Nemo_bis kyan: try adding also the --index=
19:27 Nemo_bis sometimes it's stupid enough to be the reason, though it shouldn't because it's in the same path http://www.frathwiki.com/index.php
19:32 Nemo_bis no idea why it would fail https://code.google.com/p/wikiteam/source/browse/trunk/dumpgenerator.py#901
19:33 balrog UA blocking
19:33 balrog <title>Access denied | www.frathwiki.com used CloudFlare to restrict access</title>
19:33 balrog oops
19:34 balrog <p>The owner of this website (www.frathwiki.com) has banned your access based on your browser's signature (bad82647cc6077f-ua48).</p>
19:36 balrog it's more than just UA
19:36 balrog Nemo_bis: ^
19:38 balrog https://support.cloudflare.com/hc/en-us/articles/200170086-What-does-the-Browser-Integrity-Check-do-
19:38 Nemo_bis hmm
19:39 balrog wget and curl work
19:40 Nemo_bis yes, that's what confused me :)
19:41 Nemo_bis are they blocking urllib completely?
19:41 balrog they're finding some way to block it
19:42 balrog I changed the first two lines of checkAPI to the following and it works:
19:42 balrog req = urllib2.Request(url=api, headers={'User-Agent': getUserAgent()})
19:42 balrog f = urllib2.urlopen(req)
19:43 balrog getPageTitlesScraper may still be broken
19:43 balrog they're probably blocking the urllib UA
19:43 Nemo_bis yeah
19:43 kyan strange…
19:43 Nemo_bis I was doing the same change at the same time
19:43 balrog and the script isn't consistent and is using urllib in a few places and urllib2 everywhere else
19:44 Nemo_bis indeed
19:49 Nemo_bis ok, fixed that one (thanks balrog), only two urllib.urlopen left :)
19:49 balrog probably should fix those as well or those gets will fail
19:51 Nemo_bis yeah but they're there only for the older wikis, this one shouldn't be affected
20:02 kyan The new version of the script is working now :)
20:02 kyan thanks!
20:14 Nemo_bis kyan: thank you for reporting, it was a rather embarrassing overlook :)
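
[Editor's note] The fix discussed at 19:42 amounts to sending an explicit User-Agent header instead of the library default, which the site appeared to be rejecting. Below is a minimal sketch of that idea, written for Python 3's urllib.request rather than the Python 2 urllib2 calls quoted in the log. The names get_user_agent, build_api_request and fetch_api are hypothetical, and the User-Agent string is an assumed placeholder; the log does not show what the script's real getUserAgent() returns.

```python
import urllib.request


def get_user_agent():
    # Hypothetical stand-in for the script's getUserAgent(); the value
    # below is an assumed browser-like string, not taken from the log.
    return "Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Firefox/24.0"


def build_api_request(api_url):
    # Attach the User-Agent to the Request object so the default
    # "Python-urllib/x.y" signature is not sent.
    return urllib.request.Request(
        url=api_url,
        headers={"User-Agent": get_user_agent()},
    )


def fetch_api(api_url, timeout=30):
    # Performs the actual network call; kept separate from request
    # building so the header logic can be checked without network access.
    req = build_api_request(api_url)
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return response.read()
```

Routing every fetch through one helper like build_api_request would also address the inconsistency raised at 19:43, since no call site could fall back to the default signature. As balrog notes at 19:36, the block may depend on more than the User-Agent, so this is not guaranteed to work against every such check.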
