[07:01] oh no! the upload of the 50+ GiB archive has started before I had time to fix the curl line to give me progress info
[07:01] On the other hand, speed seems reasonable now
[07:02] http://archive.org/details/wiki-halowikinet took three times as long as my bandwidth would have allowed
[07:03] Downloaded 50030 pages :o
[07:03] Hm, so little?
[07:03] :P
[07:03] that's the biggest one I've done so far
[07:03] Seriously, I'm at several dozen million
[07:04] ah ok, a single wiki
[07:04] My biggest one is a million and a half, I think
[07:04] wow
[07:04] that's cool
[07:14] metacafe is #7 https://wikistats.wmflabs.org/display.php?t=mw&s=total_desc
[07:15] underscor: it would be wonderful if you could try werelate, it doesn't have an API, so parsing the HTML lists takes ages
[07:15] and I mean weeks on an average server
[07:16] oh wow
[07:16] what incantation do I need to run?
[07:19] hm?
[07:19] just dumpgenerator.py on its index.php
[07:21] I didn't know if there were any args I'm supposed to pass
[07:21] I haven't run dumpgenerator in like 7+ months
[07:25] underscor: nothing special, just the usual --index=URL --xml --images
[09:08] hmm, the log check is producing some more dumps http://p.defau.lt/?_8hfXraVny3wzx_HYzG9Wg
[10:41] emijrp: now we could start using http://archive.org/search.php?query=originalurl%3Astupidedia
[10:41] wikistats.wmflabs.org could link to the dumps via an originalurl: search of the API URL
[10:42] and maybe even fetch the last-updated date
[12:37] http://dpwiki.slipgateconstruct.com/ might shut down, http://dpwiki.slipgateconstruct.com/api.php
[16:57] dumping
[18:40] thanks
[20:25] ugh, very slow :(
[20:25] weird, only 32 pages?
[20:27] heh, might be, actually
[20:57] Special:Statistics
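
The dumpgenerator.py invocation discussed at 07:19-07:25 can be scripted. The sketch below is a minimal wrapper around the flags named in the log (--index=URL --xml --images); the werelate index.php URL and the assumption that dumpgenerator.py sits in the current directory are illustrative, not stated in the conversation.

    #!/usr/bin/env python
    # Minimal sketch: drive dumpgenerator.py for a wiki that has no api.php,
    # using only the flags mentioned in the log. Paths and the example URL
    # are assumptions for illustration.
    import subprocess

    def dump_wiki(index_url):
        cmd = [
            "python", "dumpgenerator.py",   # WikiTeam script referenced above
            "--index=%s" % index_url,        # entry point when api.php is absent
            "--xml",                          # full page histories
            "--images",                       # uploaded files
        ]
        subprocess.check_call(cmd)

    if __name__ == "__main__":
        # werelate exposes only index.php, so that is the URL passed here
        # (exact URL is a guess for the example).
        dump_wiki("http://www.werelate.org/w/index.php")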
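
The 10:41-10:42 idea of having wikistats.wmflabs.org link to existing dumps via an originalurl: search can be prototyped against archive.org's advancedsearch.php JSON endpoint. This is a hedged sketch: the originalurl: query form comes from the log's own example (originalurl:stupidedia), while the use of the publicdate field as a stand-in for a "last updated" date is an assumption.

    # Minimal sketch: find wikiteam dumps on archive.org by originalurl and
    # report a date, so an external site could link to them.
    import json
    import urllib.parse
    import urllib.request

    def find_dumps(term):
        # Same query the log uses in its search.php example, here via the
        # JSON API; fl[] limits the fields returned.
        params = urllib.parse.urlencode({
            "q": "originalurl:%s" % term,
            "fl[]": ["identifier", "publicdate"],
            "rows": "50",
            "output": "json",
        }, doseq=True)
        url = "https://archive.org/advancedsearch.php?" + params
        with urllib.request.urlopen(url) as resp:
            docs = json.loads(resp.read().decode("utf-8"))["response"]["docs"]
        return [(d["identifier"], d.get("publicdate")) for d in docs]

    if __name__ == "__main__":
        # Mirrors http://archive.org/search.php?query=originalurl%3Astupidedia
        for identifier, date in find_dumps("stupidedia"):
            print(identifier, date)

The human-facing equivalent of the same lookup is the search.php URL pasted in the log; both hit the same index, so whichever is easier to scrape could feed the wikistats table.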