#wikiteam 2012-04-30,Mon


Time Nickname Message
13:55 🔗 Hydriz Eh, emijrp, are you free now?
13:55 🔗 emijrp depends
13:55 🔗 emijrp lul
13:55 🔗 Hydriz okok, just a short one
13:55 🔗 Hydriz is there going to be anything done to the Wikimedia Commons grab?
13:56 🔗 emijrp i reported some bugs, but they are not fixed (in the same way Nemo_bis reports bugs to me and i dont fix them; KARMA RETURNS)
13:56 🔗 Hydriz heh
13:56 🔗 emijrp have you downloaded many GB?
13:56 🔗 Nemo_bis lol
13:56 🔗 Hydriz anyways, I finished downloading
13:57 🔗 Hydriz eh, about 120GB or so
13:57 🔗 Nemo_bis same here, but I deleted everything now
13:57 🔗 Nemo_bis to make room for the wikis
13:57 🔗 Hydriz some month grabs look good, so I transferred them to the IA already
13:57 🔗 Nemo_bis hmmm
13:57 🔗 emijrp Hydriz: cool
13:57 🔗 Nemo_bis weren't we supposed to wait
13:57 🔗 Hydriz http://archive.org/details/wikimediacommons-200606
13:57 🔗 Hydriz heh Nemo
13:57 🔗 Hydriz rules are meant to be broken :P
13:58 🔗 Nemo_bis oic
13:58 🔗 Hydriz I uploaded June + January 2006
13:58 🔗 Nemo_bis did you include the output of the checker?
13:58 🔗 Nemo_bis (or the log, I don't remember what's there)
13:58 🔗 Hydriz lol no
13:59 🔗 Hydriz I am clearing stuff off the Labs project
13:59 🔗 emijrp i think it is ok to upload whatever you have about Commons; the MediaWiki developers are not going to solve a damn thing
13:59 🔗 emijrp so, upload
14:00 🔗 emijrp perhaps it contains some broken images, but, better than nothing
14:00 🔗 Hydriz we want moar
14:00 🔗 Hydriz yeah, around 10 - 15 per day
14:00 🔗 Hydriz * broken images
14:01 🔗 Nemo_bis actually some errors were fixed
14:01 🔗 Hydriz but issue 45 is the burning issue
14:02 🔗 Hydriz it's preventing many days from being grabbed
14:02 🔗 Hydriz so I am putting them on hold before I upload
14:02 🔗 Hydriz or maybe I should upload...
14:03 🔗 emijrp no, wait
14:03 🔗 Hydriz it's not affecting other days though
14:03 🔗 emijrp the only months unaffected by issue 45 are january and june?
14:03 🔗 Hydriz yep
14:04 🔗 Nemo_bis miracle
14:04 🔗 Hydriz but in an affected month, it is isolated to only a few days
14:04 🔗 Hydriz yeah, some encoding issue that commonsdownloader.py refuses to resolve
14:04 🔗 Hydriz like slashes or other symbols
14:07 🔗 Hydriz I probably can start on July - December soon, and then we can put pressure to make more commonssql.csv s
14:07 🔗 emijrp looks like the bug only affects old versions
14:07 🔗 emijrp but i will try to fix it anyway
14:07 🔗 Hydriz heh
14:07 🔗 Hydriz but it shouldn't be of top priority anyway
14:08 🔗 emijrp can you paste the wget call ?
14:09 🔗 emijrp https://code.google.com/p/wikiteam/issues/detail?id=45
14:09 🔗 Hydriz wha..what?
14:09 🔗 emijrp just before wget starts
14:09 🔗 Hydriz lol
14:09 🔗 * Hydriz shall start the script again
14:10 🔗 emijrp it skips to the last downloading image
14:10 🔗 emijrp right?
14:10 🔗 emijrp i dont remember..
14:11 🔗 Hydriz right, give me a few minutes
14:11 🔗 Hydriz (or give the script a few more minutes)
14:12 🔗 emijrp just download 2006-02-05
14:12 🔗 Hydriz yep
14:12 🔗 Hydriz Doing...
14:13 🔗 emijrp the issue is that wget saves it like 2006/02/05/20070605200920!US__reverse.jpg but the real name is 2006/02/05/20070605200920!US_$100_reverse.jpg
14:13 🔗 emijrp i dont know if wget eats the $, or ...
14:14 🔗 Hydriz If I recall vaguely, it's the downloader that is eating it, or something
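[Editor's sketch, not part of the log] The channel does not establish whether wget, the shell, or the downloader drops the `$100`. One way to rule out the shell is to hand wget its arguments as a list instead of building a single os.system() string, so that nothing gets a chance to expand `$`. The helper names build_wget_args and download and the idea of passing a destination path are hypothetical, not taken from commonsdownloader.py; -O is wget's standard output-file option.

```python
import shlex
import subprocess


def build_wget_args(url, dest):
    """Return an argv list for wget. No shell is involved, so '$' stays literal."""
    return ["wget", "-O", dest, url]


def download(url, dest):
    # With a list argument, subprocess.run does not invoke a shell
    # (shell=False is the default), so the filename is passed through unchanged.
    return subprocess.run(build_wget_args(url, dest), check=True)


dest = "2006/02/05/20070605200920!US_$100_reverse.jpg"

# If a single shell command string is unavoidable, shlex.quote() wraps the
# name in single quotes so the shell treats '$' and '!' as plain characters.
quoted = shlex.quote(dest)
```

If the filename survives intact with this approach, the shell was the likely culprit; if it is still mangled, the problem lies elsewhere.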
14:14 🔗 Hydriz but anyway, our taskforce seems to be going well?
14:14 🔗 Hydriz the nemo dominance
14:17 🔗 emijrp ok
14:17 🔗 emijrp about the metadata of items
14:17 🔗 emijrp we need to add the ZIP links to explore the images
14:18 🔗 emijrp and a link back to WikiTeam Google Code
14:18 🔗 Hydriz that's mad
14:18 🔗 Hydriz link, yes
14:18 🔗 Hydriz but ZIP links, 31 times...
14:19 🔗 emijrp yes, that is easy, copy paste or a tiny script
14:19 🔗 emijrp to generate a cool HTML table
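[Editor's sketch, not part of the log] The "tiny script" could look roughly like the following. It assumes one ZIP per day named YYYY-MM-DD.zip inside the month's item, which is a guess at the layout and not confirmed in the log; the function name zip_table is hypothetical. The item identifier wikimediacommons-200606 comes from the link posted earlier, and archive.org/download/<identifier>/<file> is the usual Internet Archive download URL form.

```python
import calendar
from html import escape


def zip_table(year, month, item_id):
    """Build an HTML table with one row (and one ZIP link) per day of the month."""
    days_in_month = calendar.monthrange(year, month)[1]
    rows = []
    for day in range(1, days_in_month + 1):
        date = "%04d-%02d-%02d" % (year, month, day)
        # Assumed file naming: <date>.zip inside the item (hypothetical layout)
        url = "https://archive.org/download/%s/%s.zip" % (item_id, date)
        rows.append(
            '<tr><td>%s</td><td><a href="%s">%s.zip</a></td></tr>'
            % (date, escape(url), date)
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(zip_table(2006, 6, "wikimediacommons-200606"))
```

The output can be pasted into the item description, which saves writing the links by hand for every day of the month.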
14:19 🔗 * Hydriz is feeling lazy right now...
14:21 🔗 Hydriz wait wait
14:21 🔗 Hydriz the wget call?
14:21 🔗 Hydriz isn't it already in the paste inside my comment?
14:22 🔗 Hydriz unless you meant a line above that
14:22 🔗 Hydriz which is just the file name
14:23 🔗 emijrp yes, a line above
14:23 🔗 Hydriz damn
14:24 🔗 Hydriz a small oversight
14:29 🔗 emijrp when you paste that line (i hope it is shown and not hidden inside the os.system() call), i will check
14:29 🔗 emijrp i can add a try: except: too and skip that error
14:29 🔗 emijrp it looks like it only affects old versions
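[Editor's sketch, not part of the log] The "try: except: and skip" idea can be written so that failures are recorded instead of silently lost, which makes it possible to retry or report them later. The names download_all and download_one are hypothetical and do not come from commonsdownloader.py.

```python
import logging


def download_all(filenames, download_one):
    """Try every file; collect failures instead of aborting the whole day."""
    failed = []
    for name in filenames:
        try:
            download_one(name)
        except Exception as exc:  # deliberately broad: skip the file and keep going
            logging.warning("skipping %s: %s", name, exc)
            failed.append(name)
    return failed
```

Returning the list of failed names keeps a record of which images are missing from each day's grab.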
14:30 🔗 Hydriz maybe...
14:30 🔗 Hydriz but that's all the errors I got
14:30 🔗 Hydriz 8 times
14:30 🔗 emijrp 8 times where?
14:31 🔗 emijrp ah ok
14:31 🔗 Hydriz means that this bug affected the grab of 8 days
14:31 🔗 emijrp okok
14:31 🔗 emijrp not relevant for the big picture
14:32 🔗 emijrp 18TB of images and fails 8 images
14:32 🔗 Hydriz lol
14:32 🔗 emijrp MAN.
14:32 🔗 emijrp well, really 1 or 2 pictures by day
14:32 🔗 Hydriz hmm, lemme look at the IA blog post doc...
14:33 🔗 emijrp oh, i forgot to add a line to that post about the commons download
14:34 🔗 emijrp add it
14:34 🔗 * Hydriz is stunned about what to do
14:35 🔗 emijrp a comment that we have to speak about the wikimedia commons downloader task
14:35 🔗 emijrp to the google doc
14:38 🔗 Hydriz ah, the download is now in the old versions...
14:39 🔗 Hydriz got it
14:39 🔗 Hydriz I shall post it on the issue page
14:40 🔗 Hydriz emijrp: ping
14:40 🔗 emijrp ok
14:41 🔗 Hydriz yeah, it seems like it's wget
14:42 🔗 emijrp weird, but ok
14:42 🔗 emijrp i will think about it, and, if i dont see a clear solution, i will just add a try: except: and skip that shit
14:44 🔗 Hydriz lol
14:50 🔗 Hydriz hmm, thinking about it, there isn't really much I know that I can write in the blog post
14:58 🔗 emijrp at the beginning it is hard to write
14:58 🔗 emijrp later we will need more pages
14:58 🔗 Hydriz lol
14:58 🔗 Hydriz 1 week left
14:58 🔗 Hydriz though
14:59 🔗 Hydriz anyway, can I start uploading the files for the Wikimedia Commons grab?
14:59 🔗 Hydriz the rest of them
15:00 🔗 emijrp if you can modify the items later and add the missing .zip ..
15:00 🔗 Hydriz yep
15:01 🔗 Hydriz but still I got to wait for the dvds to get uploaded
15:01 🔗 Hydriz why do people want to make DVDs of Wikipedia...
15:01 🔗 Hydriz zzz
15:01 🔗 emijrp dvds?
15:02 🔗 Hydriz http://dumps.wikimedia.org/dvd.html
15:06 🔗 emijrp because there are people without internet
15:06 🔗 emijrp CDPedia is the Spanish Wikipedia CD, it is useful for El Salvador and other Latin American countries.
15:07 🔗 Hydriz I see
15:07 🔗 Hydriz yep, it's on the IA
15:07 🔗 Hydriz I am just left with dewiki
15:07 🔗 Hydriz 2 more files
15:29 🔗 Hydriz Right, good night people
15:30 🔗 Hydriz got to sleep for long day tomorrow :)
