#wikiteam 2011-07-16,Sat


Time Nickname Message
07:24 🔗 emijrp Nemo_bis: perhaps you are interested in #archiveteam channel
07:24 🔗 Nemo_bis hm, yes, thank you
07:25 🔗 Nemo_bis although more interesting things always means less time :-p
07:25 🔗 Nemo_bis I checked 991datasets.org, it's ok and it has all pages
07:25 🔗 Nemo_bis uh, 911
07:26 🔗 emijrp how many?
07:26 🔗 Nemo_bis only ~200k
07:26 🔗 Nemo_bis that's what the statistics say about that wiki
07:26 🔗 emijrp and grep?
07:28 🔗 Nemo_bis ok
07:29 🔗 Nemo_bis Also, I saw that s23 has about 1400 independent MediaWikis, I wonder if it's more or less complete than that weird list.
07:29 🔗 Nemo_bis I wrote http://s23.org/wiki/Talk:Wikistats/ToDo#Upgrade_to_API_and_automatic_bulk_additions
07:32 🔗 emijrp nice trick siteinfo for license
07:35 🔗 Nemo_bis there's also http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=rightsinfo
07:36 🔗 Nemo_bis which is better because it shows the URL
07:36 🔗 Nemo_bis (see also https://bugzilla.wikimedia.org/show_bug.cgi?id=29918 )
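[Editor's sketch] The rightsinfo query quoted above can be issued from a script. This is a minimal sketch, not anything the participants wrote: the function names, the User-Agent string and the `format=json` parameter are assumptions added for illustration, and the parser assumes the response nests the license under `query.rightsinfo` with `url` and `text` keys.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def rightsinfo_url(api_url):
    """Build the siteinfo query that asks only for licensing information."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "rightsinfo",
        "format": "json",  # assumption: JSON output is easier to parse
    }
    return api_url + "?" + urlencode(params)


def parse_rightsinfo(payload):
    """Return (license_url, license_text) from a decoded JSON response.

    Missing keys yield empty strings, e.g. when a wiki has no license set.
    """
    info = payload.get("query", {}).get("rightsinfo", {})
    return info.get("url", ""), info.get("text", "")


def fetch_rightsinfo(api_url, timeout=30):
    """Fetch and parse rightsinfo from a live wiki (performs a network request)."""
    # Hypothetical User-Agent; some sites reject the urllib default.
    request = Request(rightsinfo_url(api_url),
                      headers={"User-Agent": "license-check-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        return parse_rightsinfo(json.load(response))
```

Calling `fetch_rightsinfo("http://en.wikipedia.org/w/api.php")` would contact the wiki; the two helper functions can be used offline.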
07:36 🔗 emijrp are you using it to find big free wikis?
07:37 🔗 Nemo_bis I've used it until now, I've downloaded or I'm downloading all the biggest ones
07:37 🔗 emijrp we can do a script to scan a list of big independent wikis, and save free ones first
07:37 🔗 Nemo_bis yes
07:38 🔗 Nemo_bis there are 1400 of them on s23
07:38 🔗 Nemo_bis no idea how many are free
07:38 🔗 Nemo_bis sadly, often people don't set the configuration correctly and write the license in some random page .-/
07:39 🔗 emijrp yes, but if only 10% are free, = 140 wikis
07:39 🔗 emijrp work for a month
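[Editor's sketch] The idea of scanning a list of independent wikis and saving the free ones first could look roughly like this. It assumes the license URL of each wiki is already known (for example from the siteinfo rightsinfo query mentioned in the log). The URL patterns are an illustrative and incomplete heuristic, and all names are hypothetical. A wiki with no license URL is treated as not free here, which will miss wikis that state their license only on some page.

```python
import re

# Assumption: a rough, incomplete list of URL fragments for free licenses.
# Licenses with NC or ND clauses deliberately do not match.
FREE_PATTERNS = [
    r"creativecommons\.org/licenses/by(-sa)?/",
    r"creativecommons\.org/publicdomain/",
    r"gnu\.org/(copyleft|licenses)/fdl",
]


def is_free_license(license_url):
    """Return True if the license URL matches one of the free patterns."""
    url = license_url.lower()
    return any(re.search(pattern, url) for pattern in FREE_PATTERNS)


def free_first(wikis):
    """Order (name, license_url, page_count) tuples for downloading.

    Free wikis come first; within each group the biggest wiki comes first.
    """
    return sorted(wikis, key=lambda w: (not is_free_license(w[1]), -w[2]))


if __name__ == "__main__":
    # Made-up example entries, not real wikis.
    queue = free_first([
        ("a", "http://creativecommons.org/licenses/by-nc-sa/3.0/", 500000),
        ("b", "http://creativecommons.org/licenses/by-sa/3.0/", 1000),
        ("c", "", 9000),
        ("d", "http://www.gnu.org/copyleft/fdl.html", 200000),
    ])
    for name, license_url, pages in queue:
        print(name, pages, license_url or "(no license URL)")
```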
07:44 🔗 Nemo_bis nice curl trick: "If you want a progress meter for HTTP POST or PUT requests, you need to redirect the response output to a file, using shell redirect (>), -o [file] or similar."
07:45 🔗 Nemo_bis it's hard to upload 20 GiB files to IA without progress information and only some clues from nethogs .-p
07:46 🔗 emijrp don't you use an FTP client?
07:46 🔗 emijrp gFTP shows progress bar
07:54 🔗 Nemo_bis nooo, FTP is horrible
07:54 🔗 Nemo_bis I use http://www.archive.org/help/abouts3.txt
07:56 🔗 emijrp why?
07:56 🔗 Nemo_bis FTP is totally unreliable on IA
07:56 🔗 Nemo_bis and slow
07:57 🔗 Nemo_bis drove me crazy
07:57 🔗 Nemo_bis this way it never fails and sometimes I reach 1+ MiB/s upload
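[Editor's sketch] The log describes uploading large files over HTTP with curl and wanting progress information. The same effect can be approximated in a script by wrapping the file in a reader that counts bytes as the HTTP client consumes them. This is only a sketch under assumptions: `ProgressReader` and `put_with_progress` are hypothetical names, and the target URL and any authorization headers must be supplied by the caller according to the service's own documentation (they are not reproduced here).

```python
import os
import sys
from urllib.request import Request, urlopen


class ProgressReader:
    """Wrap a binary file object and report how many bytes have been read."""

    def __init__(self, fileobj, total, report=None):
        self.fileobj = fileobj
        self.total = total
        self.sent = 0
        self.report = report or self._print

    def _print(self, sent, total):
        # Default reporter: rewrite one status line on stderr.
        percent = 100.0 * sent / total if total else 100.0
        sys.stderr.write("\r%d / %d bytes (%.1f%%)" % (sent, total, percent))

    def read(self, size=-1):
        # The HTTP client calls read() repeatedly; count each chunk.
        chunk = self.fileobj.read(size)
        self.sent += len(chunk)
        self.report(self.sent, self.total)
        return chunk


def put_with_progress(path, url, headers):
    """PUT a file to url while reporting progress (performs a network request).

    headers: dict of extra headers, e.g. whatever authorization the
    service requires (caller-supplied; nothing is assumed here).
    """
    total = os.path.getsize(path)
    with open(path, "rb") as f:
        request = Request(url, data=ProgressReader(f, total), method="PUT")
        request.add_header("Content-Length", str(total))
        request.add_header("Content-Type", "application/octet-stream")
        for name, value in headers.items():
            request.add_header(name, value)
        with urlopen(request) as response:
            return response.status
```

The reader reports bytes handed to the HTTP client, which can run slightly ahead of what has actually left the machine, so the figure is an approximation.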
09:48 🔗 emijrp Nemo_bis: do you know this tool? http://toolserver.org/~emijrp/imagesforbio/stats.php
09:48 🔗 Nemo_bis no
09:49 🔗 emijrp best one on toolserver : P
09:49 🔗 * Nemo_bis looking
09:49 🔗 emijrp open KK:
09:50 🔗 Nemo_bis it looks for images with the same name as the article (in many languages, I suppose)?
09:50 🔗 emijrp it uses interwikis
09:51 🔗 emijrp not the same article name needed
09:51 🔗 Nemo_bis aren't there multiple tools similar to this?
09:51 🔗 Nemo_bis ok
09:51 🔗 emijrp lol no
09:51 🔗 Nemo_bis what's the difference to http://toolserver.org/~magnus/fist.php ?
09:52 🔗 emijrp in FIST you have to search by hand
09:52 🔗 emijrp my tool offers you a batch of biographies to illustrate
09:52 🔗 emijrp press "add image" and add
09:52 🔗 emijrp press "add image" and add
09:52 🔗 emijrp press "add image" and add
09:52 🔗 emijrp press "add image" and add
09:52 🔗 emijrp press "add image" and add
09:55 🔗 * Nemo_bis trying
09:55 🔗 emijrp read the instructions at the top of the page
12:44 🔗 Nemo_bis emijrp, I meant things like http://tools.wikimedia.de/~magnus/fist.php?doit=1&language=it&project=wikipedia&data=Scrittori+statunitensi&datatype=categories&params%5Bcatdepth%5D=0&params%5Brandom%5D=50&params%5Bstartat%5D=&params%5Bll_max%5D=5&params%5Bcommons_max%5D=5&params%5Bflickr_max%5D=5&params%5Binclude_flickr_id%5D=1&params%5Bpicasa_max%5D=5&params%5Bwts_max%5D=5&params%5Bgimp_max%5D=5&params%5Besp_max%5D=5&params%5Bab_max%5D=5&pa
12:44 🔗 Nemo_bis rams%5Bgeograph_max%5D=5&params%5Bgeograph_max_de%5D=5&params%5Bgeograph_max_channel-islands%5D=5&params%5Bfreemages_max%5D=5&params%5Bforarticles%5D=noimage&params%5Blessthan_images%5D=3&params%5Bdefault_thumbnail_size%5D=&params%5Bjpeg%5D=1&params%5Bpng%5D=1&params%5Bgif%5D=1&params%5Bsvg%5D=1&params%5Bogg%5D=1&params%5Boutput_format%5D=out_html&params%5Bmin_width%5D=80&params%5Bmin_height%5D=80&sources%5Blanguagelinks%5D=1
12:44 🔗 Nemo_bis argh :-O
12:45 🔗 emijrp tinyur
12:45 🔗 Nemo_bis http://ur1.ca/4ppv3
12:45 🔗 emijrp l
12:46 🔗 emijrp slow
12:49 🔗 Nemo_bis terribly slow, yes
12:49 🔗 Nemo_bis The difference is that this finds images to be uploaded to Commons, apparently.
12:54 🔗 Nemo_bis No, it finds also images on Commons: http://ur1.ca/4ppww
12:54 🔗 Nemo_bis But gives no direct link to add them, only prefilled markup.
19:46 🔗 Nemo_bis argh, I needed emijrp now
