[13:55] Eh, emijrp, are you free now?
[13:55] depends
[13:55] lul
[13:55] okok, just a short one
[13:55] is there going to be anything done to the Wikimedia Commons grab?
[13:56] i reported some bugs, but they are not fixed (in the same way Nemo_bis reports bugs to me and i don't fix them; KARMA RETURNS)
[13:56] heh
[13:56] have you downloaded many GB?
[13:56] lol
[13:56] anyways, I finished downloading
[13:57] eh, about 120GB or so
[13:57] same here, but I deleted everything now
[13:57] to make room for the wikis
[13:57] some month grabs look good, so I transferred them to the IA already
[13:57] hmmm
[13:57] Hydriz: cool
[13:57] weren't we supposed to wait
[13:57] http://archive.org/details/wikimediacommons-200606
[13:57] heh Nemo
[13:57] rules are meant to be broken :P
[13:58] oic
[13:58] I uploaded June + January 2006
[13:58] did you include the output of the checker?
[13:58] (or the log, I don't remember what's there)
[13:58] lol no
[13:59] I am clearing stuff off the Labs project
[13:59] i think it is ok to upload whatever you have about Commons; the MediaWiki developers are not going to fix a damn thing
[13:59] so, upload
[14:00] perhaps it contains some broken images, but better than nothing
[14:00] we want moar
[14:00] yeah, around 10 - 15 per day
[14:00] * broken images
[14:01] actually some errors were fixed
[14:01] but issue 45 is the burning issue
[14:02] it's preventing many days from being grabbed
[14:02] so I am putting them on hold before I upload
[14:02] or maybe I should upload...
[14:03] no, wait
[14:03] it's not affecting other days though
[14:03] the only months unaffected by issue 45 are january and june?
[14:03] yep
[14:04] miracle
[14:04] but in a month that is affected, the problem is isolated to a few days
[14:04] yeah, some encoding issue that commonsdownloader.py refuses to resolve
[14:04] like slashes or other symbols
[14:07] I probably can start on July - December soon, and then we can put pressure to make more commonssql.csv files
[14:07] looks like the bug only affects old versions
[14:07] but i will try to fix it anyway
[14:07] heh
[14:07] but it shouldn't be of top priority anyway
[14:08] can you paste the wget call?
[14:09] https://code.google.com/p/wikiteam/issues/detail?id=45
[14:09] wha..what?
[14:09] just before wget starts
[14:09] lol
[14:09] * Hydriz shall start the script again
[14:10] it skips to the last downloading image
[14:10] right?
[14:10] i don't remember..
[14:11] right, give me a few minutes
[14:11] (or give the script a few more minutes)
[14:12] just download 2006-02-05
[14:12] yep
[14:12] Doing...
[14:13] the issue is that wget saves it as 2006/02/05/20070605200920!US__reverse.jpg but the real name is 2006/02/05/20070605200920!US_$100_reverse.jpg
[14:13] i don't know if wget eats the $, or ...
[14:14] If I recall vaguely, it's the downloader that is eating it, or something
[14:14] but anyway, our taskforce seems to be going well?
[14:14] the nemo dominance
[14:17] ok
[14:17] about the metadata of items
[14:17] we need to add the ZIP links to explore the images
[14:18] and a link back to WikiTeam Google Code
[14:18] that's mad
[14:18] link, yes
[14:18] but ZIP links, 31 times...
[14:19] yes, that is easy, copy paste or a tiny script
[14:19] to generate a cool HTML table
[14:19] * Hydriz is feeling lazy right now...
[14:21] wait wait
[14:21] the wget call?
[14:21] isn't it already in the paste inside my comment?
[14:22] unless you meant a line above that
[14:22] which is just the file name
[14:23] yes, a line above
[14:23] damn
[14:24] a small oversight
[14:29] when you paste that line (i hope it is shown and not hidden inside the os.system() call), i will check
[14:29] i can add a try: except: too and skip that error
[14:29] it looks like it only affects old versions
[14:30] maybe...
[14:30] but that's all the errors I got
[14:30] 8 times
[14:30] 8 times where?
[14:31] ah ok
[14:31] it means that this bug affected the grab of 8 days
[14:31] okok
[14:31] not relevant for the big picture
[14:32] 18TB of images and it fails on 8 images
[14:32] lol
[14:32] MAN.
[14:32] well, really 1 or 2 pictures per day
[14:32] hmm, lemme look at the IA blog post doc...
[14:33] oh, i forgot to add a line to that post about the commons download
[14:34] add it
[14:34] * Hydriz is stunned about what to do
[14:35] a comment that we have to speak about the wikimedia commons downloader task
[14:35] to the google doc
[14:38] ah, the download is now in the old versions...
[14:39] got it
[14:39] I shall post it on the issue page
[14:40] emijrp: ping
[14:40] ok
[14:41] yeah, it seems like it's wget
[14:42] weird, but ok
[14:42] i will think about it, and, if i don't see a clear solution, i will just add a try: except: and skip that shit
[14:44] lol
[14:50] hmm, thinking about it, there isn't really much I know that I can write in the blog post
[14:58] at the beginning it is hard to write
[14:58] later we will need more pages
[14:58] lol
[14:58] 1 week left
[14:58] though
[14:59] anyway, can I start uploading the files for the Wikimedia Commons grab?
[14:59] the rest of them
[15:00] if you can modify the items later and add the missing .zip ..
[15:00] yep
[15:01] but I still got to wait for the DVDs to get uploaded
[15:01] why do people want to make DVDs of Wikipedia...
[15:01] zzz
[15:01] dvds?
[15:02] http://dumps.wikimedia.org/dvd.html
[15:06] because there are people without internet
[15:06] CDPedia is the Spanish Wikipedia CD, it is useful for El Salvador and other South American countries.
[15:07] I see
[15:07] yep, it's on the IA
[15:07] I am just left with dewiki
[15:07] 2 more files
[15:29] Right, good night people
[15:30] got to sleep for a long day tomorrow :)
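
On issue 45 (the "$" vanishing from 20070605200920!US_$100_reverse.jpg, 14:13) and the plan at 14:42 to "add a try: except: and skip" failing files: below is a minimal sketch of that idea, assuming the download is a wget call driven from Python. It runs wget through subprocess with an argument list, so no shell ever gets a chance to mangle characters like "$", and it skips, rather than aborts on, a file that still fails. This is not the actual commonsdownloader.py code; the function name, flags used, and paths are illustrative only.

import subprocess

def download_image(url, output_path):
    # Run wget with an argument list (no shell involved), so characters like
    # "$" in the filename are passed through untouched.
    try:
        subprocess.check_call(["wget", "-c", "-O", output_path, url])
        return True
    except subprocess.CalledProcessError as exc:
        # The proposed try/except: log the failure and skip the file.
        print("skipping %s (wget exit code %d)" % (url, exc.returncode))
        return False

# Hypothetical usage with the filename from the log (placeholder URL):
download_image("http://example.org/20070605200920!US_$100_reverse.jpg",
               "2006/02/05/20070605200920!US_$100_reverse.jpg")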
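
The "tiny script to generate a cool HTML table" of per-day ZIP links mentioned at 14:18-14:19 could be as small as the sketch below. The item name pattern (wikimediacommons-YYYYMM) matches the archive.org link pasted at 13:57, but the per-day ZIP filenames and the archive.org download path are assumptions for illustration; adjust them to whatever the uploaded items actually contain.

import calendar

def zip_link_table(year, month):
    # Assumed archive.org item identifier, e.g. wikimediacommons-200606.
    item = "wikimediacommons-%04d%02d" % (year, month)
    base = "http://archive.org/download/%s" % item
    rows = []
    for day in range(1, calendar.monthrange(year, month)[1] + 1):
        # Hypothetical per-day ZIP name; change to the real file layout.
        zipname = "%04d-%02d-%02d.zip" % (year, month, day)
        rows.append('<tr><td>%s</td><td><a href="%s/%s">%s</a></td></tr>'
                    % (zipname[:-4], base, zipname, zipname))
    return "<table>\n" + "\n".join(rows) + "\n</table>"

print(zip_link_table(2006, 6))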