[19:06] emijrp: finally caught you
[19:06] so, I'm working with ariel glenn (from WMF) and kevin day (your.org) on getting the stuff up to IA
[19:06] just thought you'd like to know
[19:07] ariel is going to generate the media XML for me, and then I can parse it with lxml or hpricot to extract the bits I need
[19:08] and the your.org guys are giving me a box with an NFS mount of the tree, so I can do easy filtering
[19:08] ^ Nemo_bis too
[19:09] great
[19:09] are you going to upload day-by-day packs to the wikiteam collection?
[19:11] still figuring out what interval to do it at
[19:11] IA wants roughly 10-15GB sized archives
[19:11] so for some projects (like the Esperanto Wiktionary), all of their media combined is like 2MB
[19:11] lol
[19:11] but Commons easily sees 10GB in a day
[19:12] (according to ariel)
[19:12] so different projects will see different rates
[19:12] the goal is to make it all fully automated
[19:12] because right now, even the wiki backups from WMF are semi-manual
[19:12] and that's tedious
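The media-XML parsing mentioned in the log could look something like the sketch below. It uses the stdlib's `xml.etree.ElementTree` so it is self-contained; `lxml.etree` offers the same streaming `iterparse` interface. The element and attribute names (`<file>`, `name`, `size`, `url`) are assumptions, since the real schema of the dump WMF generates is not shown in the conversation.

```python
# Hypothetical sketch of streaming a media XML dump and extracting
# per-file metadata. Schema (<file name=... size=... url=...>) is assumed.
import io
import xml.etree.ElementTree as etree

def extract_files(xml_bytes):
    """Yield (name, size, url) for each <file> element, streaming so the
    whole dump never has to fit in memory at once."""
    for _event, elem in etree.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "file":
            yield (elem.get("name"), int(elem.get("size")), elem.get("url"))
            elem.clear()  # free memory for already-processed elements

sample = b"""<media>
  <file name="Foo.jpg" size="12345" url="https://example.org/Foo.jpg"/>
  <file name="Bar.png" size="678" url="https://example.org/Bar.png"/>
</media>"""

files = list(extract_files(sample))
```

Streaming matters here because a full Commons media listing would be far too large to parse as one in-memory tree.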
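The "roughly 10-15GB sized archives" constraint from the log suggests a simple greedy batching step before upload. A minimal sketch, assuming each file is known as a `(name, size)` pair; the cap and the helper name are illustrative, not part of any real tooling discussed here.

```python
# Hedged sketch: greedily group files into packs whose total size stays
# under a cap (15 GB here, per the "10-15GB" figure in the log).
PACK_CAP = 15 * 1024**3  # bytes; illustrative upper bound

def make_packs(files, cap=PACK_CAP):
    """Batch (name, size) pairs into packs under `cap` bytes each.
    A single file larger than the cap still gets its own pack."""
    packs, current, current_size = [], [], 0
    for name, size in files:
        if current and current_size + size > cap:
            packs.append(current)       # close the full pack
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        packs.append(current)           # flush the final partial pack
    return packs
```

With this kind of batching, a tiny project like the Esperanto Wiktionary (~2MB of media) yields one small pack, while Commons would naturally produce many packs per day.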