#wikiteam 2012-04-04,Wed


Time Nickname Message
19:06 underscor emijrp: finally caught you
19:06 underscor so, I'm working with ariel glenn (from wmf) and kevin day (your.org) on getting the stuff up to ia
19:06 underscor Just thought you'd like to know
19:07 underscor ariel is going to generate the media xml for me, and then I can parse it with lxml or hpricot to extract the bits I need
19:08 underscor and the your.org guys are giving me a box with a nfs mount of the tree, so I can do easy filtering
19:08 underscor ^ Nemo_bis too
19:09 emijrp great
19:09 emijrp are you going to upload day-by-day packs to wikiteam collection?
19:11 underscor still figuring out what interval to do it at
19:11 underscor IA wants roughly 10-15GB sized archives
19:11 underscor So for some projects (like the Esperanto Wiktionary), all of their media combined is like 2MB
19:11 underscor lol
19:11 underscor But commons easily sees 10gb in a day
19:12 underscor (according to ariel)
19:12 underscor so different projects'll see different rates
19:12 underscor The goal is to make it all fully automated
19:12 underscor Because right now, even the wiki backups from WMF are semi-manual
19:12 underscor and that's tedius
19:12 underscor tedious*
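
[Editor's note] The interval question above (a busy project fills 10GB in a day, a small one never does) can be sidestepped by cutting packs on size instead of on date. What follows is only a minimal sketch of that idea, not the script underscor actually used: the function name `pack_by_size`, the `(name, size)` input shape, and the 10GB default are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

GB = 10**9  # decimal gigabyte; an assumption, the log does not say which unit IA means


def pack_by_size(
    files: Iterable[Tuple[str, int]],
    target_bytes: int = 10 * GB,
) -> Iterator[List[Tuple[str, int]]]:
    """Group (name, size_in_bytes) pairs, in input order, into packs.

    A pack is closed as soon as its running total reaches target_bytes,
    so every pack except possibly the last is at least target_bytes and
    overshoots by less than the size of its final file.
    """
    batch: List[Tuple[str, int]] = []
    total = 0
    for name, size in files:
        batch.append((name, size))
        total += size
        if total >= target_bytes:
            yield batch
            batch, total = [], 0
    if batch:
        # Leftover files: for a tiny project this is the only pack.
        yield batch
```

Fed a listing sorted by upload time, a high-volume project would produce roughly one pack per day while a project with 2MB of media would produce a single small pack, so no per-project interval has to be chosen by hand.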
