Time |
Nickname |
Message |
05:52
🔗
|
underscor |
Trying to convince ariel to give IA a copy of all their deleted images |
05:52
🔗
|
underscor |
>:| |
05:52
🔗
|
underscor |
haha |
05:58
🔗
|
ersi |
who be this ariel? |
06:00
🔗
|
underscor |
He's (one of? the?) head tech guys for wikimedia foundation |
06:53
🔗
|
Nemo_bis |
more or less |
08:40
🔗
|
ersi |
ah |
17:47
🔗
|
emijrp |
Nemo_bis: im thinking about that wikistats project |
17:47
🔗
|
Nemo_bis |
emijrp, good |
17:47
🔗
|
emijrp |
it is all php? |
17:47
🔗
|
Nemo_bis |
I have no idea |
17:47
🔗
|
Nemo_bis |
did you think of the images tarballs? |
17:47
🔗
|
Nemo_bis |
the rsync mirror should make things much faster |
17:48
🔗
|
emijrp |
im thinking about adding some pie charts (mediawiki, dokuwiki, ... % comparison), langs, size, active/inactive ones, etc |
17:48
🔗
|
Nemo_bis |
emijrp, also, did you see that list of API urls and what do you think? |
17:48
🔗
|
Nemo_bis |
I'm getting hundreds of errors of all sorts |
17:48
🔗
|
emijrp |
probably wget is not ok for that |
17:48
🔗
|
Nemo_bis |
apparently even functioning api.php handles are not actually able to give any output, always DB problems |
17:48
🔗
|
Nemo_bis |
it's not wget |
17:54
🔗
|
emijrp |
Nemo_bis: i dont know if rsync mirror allows to download images by date |
17:54
🔗
|
emijrp |
or just all |
17:55
🔗
|
Nemo_bis |
I don't know, probably you can set some filter |
17:55
🔗
|
Nemo_bis |
or you'll need some post-processing |
17:55
🔗
|
Nemo_bis |
bt could avoid some headaches with corrupted images and so on I think |
17:56
🔗
|
Nemo_bis |
or, to download all the stuff at once very fast and then worry only about packaging |
17:56
🔗
|
emijrp |
the problem is that you cant download all and then filter |
17:56
🔗
|
emijrp |
commons is 18TB |
17:56
🔗
|
emijrp |
you have to filter and then download |
17:56
🔗
|
Nemo_bis |
maybe IA can give a machine with 18 TB for a short while |
17:57
🔗
|
Nemo_bis |
wasn't 18TB including all wikis? |
17:57
🔗
|
Nemo_bis |
not that it changes much |
17:57
🔗
|
Nemo_bis |
if you find with underscor a fast path to archive all those images on IA, I'm sure he can find the resources |
17:58
🔗
|
emijrp |
by the way, the rsync method doesnt download the .xml description |
17:59
🔗
|
emijrp |
i think our script is better |
18:17
🔗
|
Nemo_bis |
hm |
18:17
🔗
|
Nemo_bis |
but WMF won't be happy if you start mass downloading all Commons |
18:18
🔗
|
emijrp |
i dont care |
18:18
🔗
|
emijrp |
and they dont care, dont worry |
21:01
🔗
|
underscor |
damn, missed emijrp again |
21:01
🔗
|
underscor |
I wish he'd stay longer! |