[04:29] jack293: just updates for now
[04:29] wikiapiary is down so I couldn't make a list of unarchived wikis
[07:30] how long has WikiApiary been down? did it die? Nemo_bis
[08:31] jack293: I think it was still up last week
[08:32] Specifically, I'm pretty sure it was up as of May 2 https://www.mediawiki.org/w/index.php?title=Wikimedia_Language_engineering/Reports/2018-April&diff=2770809
[09:10] 28,000 Wikispaces wikis found, still 270,000 profiles to explore
[09:12] if you check the HTML source, there is an ID, for example id '4631883' for https://iuccommonsproject.wikispaces.com/
[09:13] I have seen wikis with IDs over 20 million, though I don't know of a way to find a wiki by its ID
[09:13] so for now I rely on a spider
[09:14] I assume there are over 20 million wikis (probably including deleted ones, which are unreachable)
[09:16] and including millions of one-page (home page only) low-interest wikis
[12:16] *** davisonio has quit IRC ()
[12:16] *** davisonio has joined #wikiteam
[12:38] the wikispaces script is ready, it downloads wikis and uploads them to IA
[12:39] have fun and report any errors
[13:21] 31,000 wikis found
[15:18] *** logchfoo0 starts logging #wikiteam at Sun May 06 15:18:12 2018
[15:18] *** logchfoo0 has joined #wikiteam
[15:53] *** charles81 has quit IRC ()
[15:53] *** charles81 has joined #wikiteam
[18:09] 300k pages of spam, soon duly archived for posterity http://www.scanbc.com/wiki/index.php?title=Main_Page
[18:10] The spambots are cheerily edit warring www.scanbc.com/wiki/index.php?diff=932789
[18:44] first 75 wikispaces archived https://archive.org/search.php?query=subject%3A%22wikispaces%22&and%5B%5D=mediatype%3A%22web%22&sort=-publicdate
[19:06] 6239 MediaWiki wikis updated so far
[19:07] The remaining 4k are stuck on some error... I have little energy to fix those; better to just add an alternative that downloads the XML from the API, all revisions, 500 at a time
[20:32] good job Nemo_bis
[21:03] *** dashcloud has quit IRC (Remote host closed the connection)
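The API fallback mentioned at 19:07 can be sketched roughly as follows: page through a wiki's revision history via a standard MediaWiki action API, 500 revisions per request (the usual limit for anonymous clients), following the API's continuation tokens. This is only an illustrative sketch, not WikiTeam's actual dumpgenerator code; the function name, the injectable `fetch` parameter, and the example URL are all assumptions for demonstration.

```python
# Illustrative sketch (not WikiTeam's dumpgenerator): fetch a page's full
# revision history from a MediaWiki action API, 500 revisions per request,
# following the API's "continue" tokens until the history is exhausted.
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def iter_page_revisions(api_url, title, fetch=None, limit=500):
    """Yield every revision of `title`, `limit` revisions per API request."""
    if fetch is None:
        # Default fetcher does a real HTTP GET; tests can inject a fake one.
        fetch = lambda url: urlopen(url).read().decode("utf-8")
    params = {
        "action": "query",
        "titles": title,
        "prop": "revisions",
        "rvprop": "ids|timestamp|user|comment|content",
        "rvlimit": limit,
        "rvdir": "newer",   # oldest first, like a Special:Export history dump
        "format": "json",
    }
    cont = {}
    while True:
        data = json.loads(fetch(api_url + "?" + urlencode({**params, **cont})))
        for page in data.get("query", {}).get("pages", {}).values():
            for rev in page.get("revisions", []):
                yield rev
        if "continue" not in data:
            break
        cont = data["continue"]  # e.g. {"rvcontinue": "...", "continue": "||"}
```

Each yielded revision dict can then be serialized into the XML dump format; the design point is that a dumb, resumable 500-at-a-time loop sidesteps whatever is breaking the stuck 4k wikis in the regular update path.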