00:11 *** Start has joined #wikiteam
01:28 *** kyan has joined #wikiteam
01:51 *** vitzli has joined #wikiteam
03:45 *** kyan has quit IRC (This computer has gone to sleep)
03:48 *** vitzli has quit IRC (Leaving)
06:03 *** vitzli has joined #wikiteam
06:11 *** vitzli has quit IRC (Leaving)
06:22 *** vitzli has joined #wikiteam
08:32 *** kyan has joined #wikiteam
08:48 *** kyan has quit IRC (Leaving)
13:57 * Nemo_bis screams https://github.com/WikiTeam/wikiteam/issues/269
14:09 <vitzli> could you share wikipuella_maginet-20160211-images.txt please?
14:09 <vitzli> I'd like to stare at it
14:17 <Nemo_bis> vitzli: the reporter didn't attach it
14:18 <vitzli> I thought it was you, sorry
14:59 *** Start has quit IRC (Quit: Disconnected.)
15:39 *** Start has joined #wikiteam
15:52 *** Fletcher has quit IRC (Ping timeout: 252 seconds)
15:56 *** midas has quit IRC (Ping timeout: 260 seconds)
16:05 *** midas has joined #wikiteam
16:21 *** Start has quit IRC (Quit: Disconnected.)
16:24 *** Start has joined #wikiteam
16:25 *** Start has quit IRC (Remote host closed the connection)
16:25 *** Start has joined #wikiteam
16:40 *** Fletcher has joined #wikiteam
17:07 *** Start has quit IRC (Quit: Disconnected.)
18:37 *** svchfoo3 has quit IRC (Read error: Operation timed out)
18:39 *** svchfoo3 has joined #wikiteam
18:39 *** svchfoo1 sets mode: +o svchfoo3
18:42 *** Start has joined #wikiteam
18:43 *** vitzli has quit IRC (Leaving)
19:19 *** Start has quit IRC (Quit: Disconnected.)
19:23 *** Start has joined #wikiteam
20:28 *** ploopkazo has joined #wikiteam
20:29 <ploopkazo> once a wiki is downloaded with dumpgenerator.py, is there a way to turn it into a zim for use with a zim reader like kiwix?
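A dump like the one being discussed comes from an invocation along the lines documented in the WikiTeam README; the target URL here is a placeholder:

    python dumpgenerator.py --api=https://wiki.example.org/api.php --xml --images

This writes the XML history dump plus an images directory with the media files, which is the input assumed in the rest of the discussion.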
20:45 *** Start has quit IRC (Quit: Disconnected.)
20:55 <Nemo_bis> ploopkazo: not really
20:55 <Nemo_bis> ploopkazo: are you trying to make a ZIM file for a dead wiki?
21:15 <ploopkazo> Nemo_bis: no, it's still online
21:17 <Nemo_bis> ploopkazo: then you should install Parsoid and run mwoffliner with it
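Roughly, the Parsoid-plus-mwoffliner route suggested here would look like this; the flag names follow mwoffliner's CLI of the era (--mwUrl, --parsoidUrl, --adminEmail) and may differ in later versions, and the URLs and email address are placeholders:

    npm install -g mwoffliner
    mwoffliner --mwUrl=https://wiki.example.org/ \
               --parsoidUrl=http://localhost:8000/ \
               --adminEmail=you@example.org

mwoffliner scrapes the live wiki through Parsoid and writes a ZIM file that Kiwix can read.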
21:17 <ploopkazo> Nemo_bis: how does one view a dumpgenerator.py dump if not conversion to zim?
21:17 <Nemo_bis> The XML dump can be parsed properly only by MediaWiki itself
21:18 <Nemo_bis> dumpgenerator.py attempts to collect all the information one will need to create a clone of the original wiki
21:18 <ploopkazo> oh
21:18 <xmc> welllll there's another tool that reads mediawiki xml dumps https://github.com/chronomex/wikiscraper
21:18 <xmc> not full fidelity but it's fun to play with
21:19 <xmc> :)
21:20 <ploopkazo> so if I want to convert an xml dump to zim, my only real option is to run php+mysql and then scrape my local instance with mwoffliner?
21:21 <Nemo_bis> AFAIK yes
21:21 <ploopkazo> how easy is it to load the dump into a mw instance?
21:21 <Nemo_bis> Well, there's also the old dumpHTML way but that's a bit hacky
21:21 <Nemo_bis> Depends on the wiki size and extensions
21:23 <ploopkazo> what kind of extensions? I haven't run mediawiki before
21:23 <Nemo_bis> Can you link the wiki in question?
21:23 <ploopkazo> at the moment, https://wiki.puella-magi.net
21:23 <ploopkazo> though I imagine I'll collect a lot of them once I have the process figured out
21:24 <Nemo_bis> Ok, that's an easy one
21:24 <ploopkazo> oh, http
21:25 <ploopkazo> https isn't loading for some reason
21:25 <ploopkazo> though https is the one in my history
21:25 <Nemo_bis> loaded for me
21:25 <xmc> pretty slow on my end
21:25 <ploopkazo> weird
21:26 <ploopkazo> Nemo_bis: how do you tell it's an easy one? is there an info page with the installed extensions or something?
21:27 <Nemo_bis> ploopkazo: http://wiki.puella-magi.net/Special:Version ; you're especially interested in the parser-related things
21:27 <ploopkazo> parsoid is always desirable, right?
21:27 <Nemo_bis> ploopkazo: in theory yes, but in practice few use it outside Wikimedia and Wikimedia-like installs
21:27 <Nemo_bis> And it doesn't work even for some Wikimedia wikis yet
21:28 <ploopkazo> hmm
21:28 <Nemo_bis> In theory the worst that can happen is that it doesn't parse some pages
21:28 <ploopkazo> which things on that wiki make it easy?
21:28 <Nemo_bis> It's a small wiki and it only has the most common parser extensions
21:28 <ploopkazo> oh
21:29 <Nemo_bis> <gallery>, <math>, <nowiki>, <pre>, <ref> and <references>
21:29 <Nemo_bis> <math> can be nasty but it's still a common one
21:29 <ploopkazo> so once my xml+image dump is complete, how would I go about loading that into a local mw instance?
21:30 <Nemo_bis> And of course it can happen that a wiki has a custom parser extension nobody has the code of
21:30 <Nemo_bis> ploopkazo: https://meta.wikimedia.org/wiki/Data_dumps/Tools_for_importing
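That page lists several options; the stock route uses MediaWiki's own maintenance scripts. A minimal sketch, run from the root of a working MediaWiki install, with dump.xml and images/ as hypothetical names for the dumpgenerator.py output:

    php maintenance/importDump.php dump.xml
    php maintenance/importImages.php images/
    php maintenance/rebuildrecentchanges.php

importDump.php is known to be slow on large histories, which is why faster SQL-level tools such as mwimport (discussed below) exist.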
21:30
🔗
|
Nemo_bis |
ploopkazo: in theory Parsoid is near to the point where it doesn't even need MediaWiki to parse the wikitext; I'm not sure if someone tried yet |
21:31
🔗
|
Nemo_bis |
Maybe for simple wikis it works, in that case you'd "just" run the node.js service, feed it with wikitext from the XML and get your HTML |
21:32
🔗
|
Nemo_bis |
No idea how it expands templates though |
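To make the Parsoid-without-MediaWiki idea concrete: the Parsoid service exposes an HTTP transform endpoint, so the experiment would be to POST wikitext pulled from the XML dump at it. A sketch, assuming a Parsoid service on localhost:8000 configured for the wiki's domain (the path follows Parsoid's v3 HTTP API; details vary by version):

    curl -s --data-urlencode "wikitext=Hello ''world''" \
         http://localhost:8000/wiki.example.org/v3/transform/wikitext/to/html

The template caveat is real: Parsoid normally expands templates by calling back into the wiki's api.php, so with no MediaWiki behind it, template-heavy pages would come out unexpanded.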
21:34
🔗
|
Nemo_bis |
ploopkazo: ah and if you want to make this scale you probably need to look into installation automatisms like https://www.mediawiki.org/wiki/MediaWiki-Vagrant or https://github.com/wikimedia/mediawiki-containers |
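For MediaWiki-Vagrant, the documented bootstrap is short (assuming Vagrant and VirtualBox are already installed):

    git clone https://gerrit.wikimedia.org/r/mediawiki/vagrant
    cd vagrant
    ./setup.sh
    vagrant up

That yields a disposable MediaWiki at a local address, which is handy if the goal is one throwaway instance per archived wiki.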
21:36 <ploopkazo> Nemo_bis: is the format dumpgenerator.py creates practically identical to the format wikimedia releases their dumps in?
21:37 <Nemo_bis> Now sorry if I overloaded you with information instead of making your life simpler... but if you succeed, that's the holy grail. :)
21:37 <Nemo_bis> ploopkazo: the format is identical for all wikis (given a release, of course).
21:38 <Nemo_bis> ploopkazo: but each page is just a long blob of text that might contain anything
21:38
🔗
|
Nemo_bis |
If your question is whether mwimport is supposed to work, the answer is (suprisingly) yes |
21:39
🔗
|
Nemo_bis |
The basics of the database schema for MediaWiki didn't change since 2005 when mwimport was created |
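mwimport is a Perl script that translates the XML dump into SQL on stdout, so usage is roughly of this shape (database name and credentials are hypothetical; check the script's own documentation for exact options):

    mwimport < dump.xml | mysql -u wiki -p wikidb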
21:41
🔗
|
Nemo_bis |
Anyway, for more specific help if you get stuck somewhere ask cscott (for Parsoid-without-MediaWiki) and Kelson on #kiwix (for mwoffliner, dumpHTML etc.), both on Freenode. Now I'm going to bed :) |
21:46
🔗
|
ploopkazo |
thanks |
23:15
🔗
|
|
Start has joined #wikiteam |