| Time |
Nickname |
Message |
|
13:55
🔗
|
Hydriz |
Eh, emijrp, are you free now? |
|
13:55
🔗
|
emijrp |
depend |
|
13:55
🔗
|
emijrp |
lul |
|
13:55
🔗
|
Hydriz |
okok, just a short one |
|
13:55
🔗
|
Hydriz |
is there going to be anything done to the Wikimedia Commons grab? |
|
13:56
🔗
|
emijrp |
i reported some bugs, but they are not fixed (in the same way Nemo_bis reports bugs to me and i don't fix them; KARMA RETURNS) |
|
13:56
🔗
|
Hydriz |
heh |
|
13:56
🔗
|
emijrp |
have you downloaded many GB? |
|
13:56
🔗
|
Nemo_bis |
lol |
|
13:56
🔗
|
Hydriz |
anyways, I finished downloading |
|
13:57
🔗
|
Hydriz |
eh, about 120GB or so |
|
13:57
🔗
|
Nemo_bis |
same here, but I deleted everything now |
|
13:57
🔗
|
Nemo_bis |
to make room for the wikis |
|
13:57
🔗
|
Hydriz |
some month grabs look good, so I transferred them to the IA already |
|
13:57
🔗
|
Nemo_bis |
hmmm |
|
13:57
🔗
|
emijrp |
Hydriz: cool |
|
13:57
🔗
|
Nemo_bis |
weren't we supposed to wait |
|
13:57
🔗
|
Hydriz |
http://archive.org/details/wikimediacommons-200606 |
|
13:57
🔗
|
Hydriz |
heh Nemo |
|
13:57
🔗
|
Hydriz |
rules are meant to be broken :P |
|
13:58
🔗
|
Nemo_bis |
oic |
|
13:58
🔗
|
Hydriz |
I uploaded June + January 2006 |
|
13:58
🔗
|
Nemo_bis |
did you include the output of the checker? |
|
13:58
🔗
|
Nemo_bis |
(or the log, I don't remember what's there) |
|
13:58
🔗
|
Hydriz |
lol no |
|
13:59
🔗
|
Hydriz |
I am clearing stuff off the Labs project |
|
13:59
🔗
|
emijrp |
i think it is ok if you upload whatever you have about Commons, the MediaWiki developers are not going to solve a damn thing |
|
13:59
🔗
|
emijrp |
so, upload |
|
14:00
🔗
|
emijrp |
perhaps it contains some broken images, but, better than nothing |
|
14:00
🔗
|
Hydriz |
we want moar |
|
14:00
🔗
|
Hydriz |
yeah, around 10 - 15 per day |
|
14:00
🔗
|
Hydriz |
* broken images |
|
14:01
🔗
|
Nemo_bis |
actually some errors were fixed |
|
14:01
🔗
|
Hydriz |
but issue 45 is the burning issue |
|
14:02
🔗
|
Hydriz |
it's preventing many days from being grabbed |
|
14:02
🔗
|
Hydriz |
so I am putting them on hold before I upload |
|
14:02
🔗
|
Hydriz |
or maybe I should upload... |
|
14:03
🔗
|
emijrp |
no, wait |
|
14:03
🔗
|
Hydriz |
it's not affecting other days though |
|
14:03
🔗
|
emijrp |
the only months unaffected by issue 45 are january and june? |
|
14:03
🔗
|
Hydriz |
yep |
|
14:04
🔗
|
Nemo_bis |
miracle |
|
14:04
🔗
|
Hydriz |
but in a month that is affected, it is isolated to only a few days |
|
14:04
🔗
|
Hydriz |
yeah, some encoding issue that commonsdownloader.py refuses to resolve |
|
14:04
🔗
|
Hydriz |
like slashes or other symbols |
|
14:07
🔗
|
Hydriz |
I probably can start on July - December soon, and then we can put pressure to make more commonssql.csv s |
|
14:07
🔗
|
emijrp |
looks like the bug only affects old versions |
|
14:07
🔗
|
emijrp |
but i will try to fix it anyway |
|
14:07
🔗
|
Hydriz |
heh |
|
14:07
🔗
|
Hydriz |
but it shouldn't be of top priority anyway |
|
14:08
🔗
|
emijrp |
can you paste the wget call ? |
|
14:09
🔗
|
emijrp |
https://code.google.com/p/wikiteam/issues/detail?id=45 |
|
14:09
🔗
|
Hydriz |
wha..what? |
|
14:09
🔗
|
emijrp |
just before wget stat |
|
14:09
🔗
|
emijrp |
starts |
|
14:09
🔗
|
Hydriz |
lol |
|
14:09
🔗
|
* |
Hydriz shall start the script again |
|
14:10
🔗
|
emijrp |
it skips to the last downloading image |
|
14:10
🔗
|
emijrp |
right? |
|
14:10
🔗
|
emijrp |
i dont remember.. |
|
14:11
🔗
|
Hydriz |
right, give me a few minutes |
|
14:11
🔗
|
Hydriz |
(or give the script a few more minutes) |
|
14:12
🔗
|
emijrp |
just download 2006-02-05 |
|
14:12
🔗
|
Hydriz |
yep |
|
14:12
🔗
|
Hydriz |
Doing... |
|
14:13
🔗
|
emijrp |
the issue is that wget saves it like 2006/02/05/20070605200920!US__reverse.jpg but the real name is 2006/02/05/20070605200920!US_$100_reverse.jpg |
|
14:13
🔗
|
emijrp |
i dont know if wget eats the $, or ... |
|
14:14
🔗
|
Hydriz |
If I recall vaguely, it's the downloader that is eating it, or something |
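One possible cause of the missing `$100`, not confirmed anywhere in this log, is shell expansion: if a file name is pasted into a command string that goes through `/bin/sh`, the shell treats `$1` as a positional parameter and replaces it. The sketch below only illustrates that effect with `printf`; it is not the code of commonsdownloader.py, and the helper name `echo_via_shell` is made up for the example.

```python
import shlex
import subprocess

# File name taken from the discussion above.
name = "20070605200920!US_$100_reverse.jpg"


def echo_via_shell(arg: str) -> str:
    """Pass `arg` through /bin/sh and return what the shell handed to printf."""
    result = subprocess.run(
        f"printf %s {arg}",
        shell=True,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Double quotes do not stop parameter expansion, so the name is altered.
# The exact result depends on which shell /bin/sh is.
unsafe = echo_via_shell(f'"{name}"')

# shlex.quote wraps the name in single quotes, so the shell leaves it alone.
safe = echo_via_shell(shlex.quote(name))

# Passing an argument list skips the shell entirely.
direct = subprocess.run(
    ["printf", "%s", name], capture_output=True, text=True, check=True
).stdout

print("double-quoted:", unsafe)
print("shlex.quote:  ", safe)
print("no shell:     ", direct)
```

If the downloader does build its wget command as a single string, quoting each name with `shlex.quote` or switching to an argument list would be the usual way to rule this cause out.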
|
14:14
🔗
|
Hydriz |
but anyway, our taskforce seems to be going well? |
|
14:14
🔗
|
Hydriz |
the nemo dominance |
|
14:17
🔗
|
emijrp |
ok |
|
14:17
🔗
|
emijrp |
about the metadata of items |
|
14:17
🔗
|
emijrp |
we need to add the ZIP links to explore the images |
|
14:18
🔗
|
emijrp |
and a link back to WikiTeam Google Code |
|
14:18
🔗
|
Hydriz |
that's mad |
|
14:18
🔗
|
Hydriz |
link, yes |
|
14:18
🔗
|
Hydriz |
but ZIP links, 31 times... |
|
14:19
🔗
|
emijrp |
yes, that is easy, copy paste or a tiny script |
|
14:19
🔗
|
emijrp |
to generate a cool HTML table |
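The "tiny script" idea could look roughly like this: one table row per day of the month, each linking to that day's ZIP inside the item. The `https://archive.org/download/<item>/<file>` pattern is the usual download URL form, but the per-day file name (`YYYY-MM-DD.zip`) and the function name `zip_links_table` are assumptions for illustration and should be adjusted to the real file names in the item.

```python
import calendar
from html import escape


def zip_links_table(year: int, month: int, item: str) -> str:
    """Build an HTML table with one ZIP link per day of the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    rows = []
    for day in range(1, days_in_month + 1):
        date = f"{year:04d}-{month:02d}-{day:02d}"
        # Assumed naming scheme: one archive per day, named after the date.
        url = f"https://archive.org/download/{item}/{date}.zip"
        rows.append(
            f'<tr><td>{escape(date)}</td>'
            f'<td><a href="{escape(url)}">{escape(date)}.zip</a></td></tr>'
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


table = zip_links_table(2006, 6, "wikimediacommons-200606")
print(table)
```

The output can be pasted into the item description, so the 31 links never have to be typed by hand.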
|
14:19
🔗
|
* |
Hydriz is feeling lazy right now... |
|
14:21
🔗
|
Hydriz |
wait wait |
|
14:21
🔗
|
Hydriz |
the wget call? |
|
14:21
🔗
|
Hydriz |
isn't it already in the paste inside my comment? |
|
14:22
🔗
|
Hydriz |
unless you meant a line above that |
|
14:22
🔗
|
Hydriz |
which is just the file name |
|
14:23
🔗
|
emijrp |
yes, a line above |
|
14:23
🔗
|
Hydriz |
damn |
|
14:24
🔗
|
Hydriz |
a small oversight |
|
14:29
🔗
|
emijrp |
when you paste that line (i hope it is shown and not hidden inside the os.system() call), i will check |
|
14:29
🔗
|
emijrp |
i can add a try: except: too and skip that error |
|
14:29
🔗
|
emijrp |
it looks like it only affects old versions |
|
14:30
🔗
|
Hydriz |
maybe... |
|
14:30
🔗
|
Hydriz |
but that's all the errors I got |
|
14:30
🔗
|
Hydriz |
8 times |
|
14:30
🔗
|
emijrp |
8 times where? |
|
14:31
🔗
|
emijrp |
ah ok |
|
14:31
🔗
|
Hydriz |
means that this bug affected the grab of 8 days |
|
14:31
🔗
|
emijrp |
okok |
|
14:31
🔗
|
emijrp |
not relevant for the big picture |
|
14:32
🔗
|
emijrp |
18TB of images and fails 8 images |
|
14:32
🔗
|
Hydriz |
lol |
|
14:32
🔗
|
emijrp |
MAN. |
|
14:32
🔗
|
emijrp |
well, really 1 or 2 pictures by day |
|
14:32
🔗
|
Hydriz |
hmm, lemme look at the IA blog post doc... |
|
14:33
🔗
|
emijrp |
oh, i forgot to add a line to that post about the commons download |
|
14:34
🔗
|
emijrp |
add it |
|
14:34
🔗
|
* |
Hydriz is stunned about what to do |
|
14:35
🔗
|
emijrp |
a comment about we have to skeap about the wikimedia commons downloader task |
|
14:35
🔗
|
emijrp |
to the google doc |
|
14:35
🔗
|
emijrp |
speak* |
|
14:38
🔗
|
Hydriz |
ah, the download is now in the old versions... |
|
14:39
🔗
|
Hydriz |
got it |
|
14:39
🔗
|
Hydriz |
I shall post it on the issue page |
|
14:40
🔗
|
Hydriz |
emijrp: ping |
|
14:40
🔗
|
emijrp |
ok |
|
14:41
🔗
|
Hydriz |
yeah, it seems like it's wget |
|
14:42
🔗
|
emijrp |
weird, but ok |
|
14:42
🔗
|
emijrp |
i will think about it, and, if i don't see a clear solution, i will just add a try: except: and skip that shit |
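A minimal sketch of the "try: except: and skip" fallback, assuming the downloader loops over file names and calls some fetch function per file. `download_all` and `fake_fetch` are hypothetical names; the real script's structure is not shown in this log. Recording the skipped names keeps the handful of failures visible so they can be retried or listed in the item metadata.

```python
from typing import Callable, Iterable, List, Tuple


def download_all(
    names: Iterable[str], fetch: Callable[[str], None]
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try every file; record failures instead of aborting the whole day."""
    done: List[str] = []
    failed: List[Tuple[str, str]] = []
    for name in names:
        try:
            fetch(name)
        except Exception as err:  # skip this file, but remember why
            failed.append((name, str(err)))
            continue
        done.append(name)
    return done, failed


def fake_fetch(name: str) -> None:
    """Stand-in for the real download step; fails on names containing '$'."""
    if "$" in name:
        raise OSError(f"saved file does not match expected name: {name}")


done, failed = download_all(
    ["Example.jpg", "20070605200920!US_$100_reverse.jpg", "Other.png"],
    fake_fetch,
)
print("downloaded:", done)
print("skipped:", failed)
```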
|
14:44
🔗
|
Hydriz |
lol |
|
14:50
🔗
|
Hydriz |
hmm, thinking about it, there isn't really much I know that I can write in the blog post |
|
14:58
🔗
|
emijrp |
at the beginning it is hard to write |
|
14:58
🔗
|
emijrp |
later we will need more pages |
|
14:58
🔗
|
Hydriz |
lol |
|
14:58
🔗
|
Hydriz |
1 week left |
|
14:58
🔗
|
Hydriz |
though |
|
14:59
🔗
|
Hydriz |
anyway, can I start uploading the files for the Wikimedia Commons grab? |
|
14:59
🔗
|
Hydriz |
the rest of them |
|
15:00
🔗
|
emijrp |
if you can modify the items later and add the missing .zip .. |
|
15:00
🔗
|
Hydriz |
yep |
|
15:01
🔗
|
Hydriz |
but still I got to wait for the dvds to get uploaded |
|
15:01
🔗
|
Hydriz |
why do people want to make DVDs of Wikipedia... |
|
15:01
🔗
|
Hydriz |
zzz |
|
15:01
🔗
|
emijrp |
dvds? |
|
15:02
🔗
|
Hydriz |
http://dumps.wikimedia.org/dvd.html |
|
15:06
🔗
|
emijrp |
because there are people without internet |
|
15:06
🔗
|
emijrp |
CDPedia is the Spanish Wikipedia CD, it is useful for El Salvador and other South American countries. |
|
15:07
🔗
|
Hydriz |
I see |
|
15:07
🔗
|
Hydriz |
yep, it's on the IA |
|
15:07
🔗
|
Hydriz |
I am just left with dewiki |
|
15:07
🔗
|
Hydriz |
2 more files |
|
15:29
🔗
|
Hydriz |
Right, good night people |
|
15:30
🔗
|
Hydriz |
got to sleep for long day tomorrow :) |