[07:24] Nemo_bis: perhaps you are interested in #archiveteam channel
[07:24] hm, yes, thank you
[07:25] although more interesting things always means less time :-p
[07:25] I checked 991datasets.org, it's ok and it has all pages
[07:25] uh, 911
[07:26] how many?
[07:26] only ~200k
[07:26] that's what the statistics say about that wiki
[07:26] and grep?
[07:28] ok
[07:29] Also, I saw that s23 has about 1400 independent MediaWikis, I wonder if it's more or less complete than that weird list.
[07:29] I wrote http://s23.org/wiki/Talk:Wikistats/ToDo#Upgrade_to_API_and_automatic_bulk_additions
[07:32] nice trick siteinfo for license
[07:35] there's also http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=rightsinfo
[07:36] which is better because it shows the URL
[07:36] (see also https://bugzilla.wikimedia.org/show_bug.cgi?id=29918 )
[07:36] are you using it to find big free wikis?
[07:37] I've used it until now, I've downloaded or I'm downloading all the biggest ones
[07:37] we can do a script to scan a list of big independent wikis, and save free ones first
[07:37] yes
[07:38] there are 1400 of them on s23
[07:38] no idea how many are free
[07:38] sadly, often people don't set the configuration correctly and write the license in some random page .-/
[07:39] yes, but if only 10% are free, = 140 wikis
[07:39] work for a month
[07:44] nice curl trick: "If you want a progress meter for HTTP POST or PUT requests, you need to redirect the response output to a file, using shell redirect (>), -o [file] or similar."
[07:45] it's hard to upload 20 GiB files to IA without progress information and only some clues from nethogs .-p
[07:46] don't you use ftp client?
[07:46] gFTP shows progress bar
[07:54] nooo, FTP is horrible
[07:54] I use http://www.archive.org/help/abouts3.txt
[07:56] why?
[07:56] FTP is totally unreliable on IA
[07:56] and slow
[07:57] drove me crazy
[07:57] this way it never fails and sometimes I reach 1+ MiB/s upload
[09:48] Nemo_bis: do you know this tool? http://toolserver.org/~emijrp/imagesforbio/stats.php
[09:48] no
[09:49] best one on toolserver : P
[09:49] * Nemo_bis looking
[09:49] open KK:
[09:50] it looks for images with the same name as the article (in many languages, I suppose)?
[09:50] it uses interwikis
[09:51] not the same article name needed
[09:51] aren't there multiple tools similar to this?
[09:51] ok
[09:51] lol no
[09:51] what's the difference to http://toolserver.org/~magnus/fist.php ?
[09:52] in FIST you have to search by hand
[09:52] my tool offers you a batch of biographies to illustrate
[09:52] press "add image" and add
[09:52] press "add image" and add
[09:52] press "add image" and add
[09:52] press "add image" and add
[09:52] press "add image" and add
[09:55] * Nemo_bis trying
[09:55] read the instructions at the top of the page
[12:44] emijrp, I meant things like http://tools.wikimedia.de/~magnus/fist.php?doit=1&language=it&project=wikipedia&data=Scrittori+statunitensi&datatype=categories&params%5Bcatdepth%5D=0&params%5Brandom%5D=50&params%5Bstartat%5D=&params%5Bll_max%5D=5&params%5Bcommons_max%5D=5&params%5Bflickr_max%5D=5&params%5Binclude_flickr_id%5D=1&params%5Bpicasa_max%5D=5&params%5Bwts_max%5D=5&params%5Bgimp_max%5D=5&params%5Besp_max%5D=5&params%5Bab_max%5D=5&pa
[12:44] rams%5Bgeograph_max%5D=5&params%5Bgeograph_max_de%5D=5&params%5Bgeograph_max_channel-islands%5D=5&params%5Bfreemages_max%5D=5&params%5Bforarticles%5D=noimage&params%5Blessthan_images%5D=3&params%5Bdefault_thumbnail_size%5D=&params%5Bjpeg%5D=1&params%5Bpng%5D=1&params%5Bgif%5D=1&params%5Bsvg%5D=1&params%5Bogg%5D=1&params%5Boutput_format%5D=out_html&params%5Bmin_width%5D=80&params%5Bmin_height%5D=80&sources%5Blanguagelinks%5D=1
[12:44] argh :-O
[12:45] tinyur
[12:45] http://ur1.ca/4ppv3
[12:45] l
[12:46] slow
[12:49] terribly slow, yes
[12:49] The difference is that this finds images to be uploaded to Commons, apparently.
[12:54] No, it finds also images on Commons: http://ur1.ca/4ppww
[12:54] But gives no direct link to add them, only prefilled markup.
[19:46] argh, I needed emijrp now
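
Note on the 07:35-07:38 exchange: the script proposed there (scan a list of independent wikis and save the free ones first) could start from something like the sketch below. It is not the script anyone in the channel wrote. It queries the siteinfo/rightsinfo API call quoted at 07:35 and guesses from the returned license URL and text whether the wiki is free. The function names, the FREE_MARKERS list and the example.org address are all assumptions made for illustration, and the marker list is a rough heuristic, not a complete license check.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a rough, incomplete list of substrings that suggest a free license.
FREE_MARKERS = (
    "creativecommons.org/licenses/by/",
    "creativecommons.org/licenses/by-sa/",
    "creativecommons.org/publicdomain/",
    "gnu.org/copyleft/fdl",
    "gnu free documentation license",
    "public domain",
)


def rightsinfo_url(api_url):
    """Build the siteinfo/rightsinfo query for a wiki's api.php address."""
    query = urlencode({
        "action": "query",
        "meta": "siteinfo",
        "siprop": "rightsinfo",
        "format": "json",
    })
    return api_url + "?" + query


def parse_rightsinfo(payload):
    """Return (license_url, license_text) from a decoded API response."""
    info = payload.get("query", {}).get("rightsinfo", {})
    return info.get("url") or "", info.get("text") or ""


def looks_free(license_url, license_text):
    """Heuristic only: True if the URL or text contains a known free marker."""
    haystack = (license_url + " " + license_text).lower()
    return any(marker in haystack for marker in FREE_MARKERS)


def fetch_rightsinfo(api_url, timeout=30):
    """Query one wiki over the network and return (license_url, license_text)."""
    with urlopen(rightsinfo_url(api_url), timeout=timeout) as response:
        return parse_rightsinfo(json.load(response))


if __name__ == "__main__":
    # Offline demonstration with a hand-written sample response.
    sample = {"query": {"rightsinfo": {
        "url": "//creativecommons.org/licenses/by-sa/3.0/",
        "text": "Creative Commons Attribution-Share Alike 3.0",
    }}}
    url, text = parse_rightsinfo(sample)
    print(url, looks_free(url, text))
```

As noted at 07:38, many wikis leave this setting empty and state the license on some page instead, so an empty result here means "unknown", not "non-free".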