[07:38] I'm at 95 wikis downloaded with the new script
[07:39] And no way to get it to consume a good amount of bandwidth; perhaps I should go with 50 threads.
[07:39] underscor, you should run at least 100 for 1000 wikis. ;)
[08:29] Um... How do I claim a list?
[08:59] mutoso, I suppose you don't have access to our wiki, so just tell me
[08:59] Yeah, I don't.
[08:59] I started list11.
[09:00] ok
[09:00] mutoso, did you split the list?
[09:00] jobs are very long but not resource-intensive
[09:00] I mean split -l 10 list011 list011
[09:00] Oh. I see. No. I didn't.
[09:01] I suggest you do so.
[09:01] Also, how do we contact you for updates? Are you on our mailing list?
[09:02] mutoso, I have to go now; join http://groups.google.com/group/wikiteam-discuss if you didn't yet. Thank you!
[09:09] Alright. I split the list and joined the mailing list. I'm heading to bed, but I'll be idling in here. Feel free to ping me.
[14:48] Nemo_bis: I've been doing 10 threads of 10
[14:48] :P
[16:25] underscor, slacker
[16:55] hahahah
[17:05] underscor, I'm running 60+ instances on my desktop and I have only an AMD E-350!
[17:06] 208 wikis downloaded so far
[17:18] come on, underscor, launch some dozens of instances :)
[20:29] emijrp, I'm at 223 wikis archived.
[20:30] And I've consumed half of my free disk space, so either you create a script for the upload soon or I'll have to do some aggressive disk cleanup.
[20:33] or you get a life
[20:35] I can't create a script to upload. It's the usual s3 or form upload at the Internet Archive.
[20:37] get a life? scripts are doing everything
[20:37] Well, you wrote to wait for instructions about the upload...
[20:38] yes, we have to decide on a procedure
[20:38] The script should check which dumps have been completed, fetch the URL from the config file, take the name and license of the wiki from its API, upload with the metadata, and delete the files.
[20:38] It's quite a lot of paperwork otherwise.
[20:39] ok, that script is easy
[20:39] yeah
[20:39] I hope some other guy steps forward and codes some lines
[20:39] what about you, underscor :-D
[20:51] Nemo_bis: run: wc -l */*titles.txt
[20:51] post the sum
[20:52] 2582906
[20:52] biggest wikis still running though
[20:52] Nemo_bis: I will!
[20:52] underscor, \o/
[20:52] I'm working on an automated interface for it
[20:52] because keeping track of 60 shells is a pain in the ass
[20:52] heh
[20:53] well, that's another issue; now that launcher.py resumes all uncompleted jobs, I just run them in a detached screen and forget about them, then rerun
[20:53] for the upload, remember to follow the correct format :p https://code.google.com/p/wikiteam/wiki/NewTutorial#Publishing_the_dump
[20:56] you can work on a launcher for the launcher which launches launcher.py
[20:56] 3 fucking levels of downloading and hoarding wikis. Inception.
[20:57] :D
[21:01] 60 shells? you have claimed just one list, underscor
[21:15] underscor, even better: also put the api.php URL in the metadata, invent something or abuse an existing field :p
[21:15] emijrp, did you see the new issues I filed? :)
[21:24] more?
[21:24] lol no, thanks
[21:24] what a nightmare
[21:32] :D
[21:33] the issue about redirects should be easy to solve, probably just some flag in urllib or something
[21:36] seeya
[22:13] 60 shells? you have claimed just one list, underscor
[22:14] I was just commenting on Nemo_bis's thing about 60 simultaneous
[22:14] oh, he left
[22:14] dammit
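
The upload procedure described at 20:38 (check which dumps are complete, take the wiki URL from the config file, get the name and license from the wiki's API, upload to the Internet Archive with that metadata, then delete the files) might look roughly like the Python sketch below. It is only an illustration, not the script WikiTeam eventually wrote: it assumes the archive.org S3-like API with LOW access keys, the `requests` library, and placeholder choices for the item identifier and metadata fields.

```python
# Hypothetical sketch of the upload procedure discussed above.
# Assumes a per-dump directory containing the .xml.7z archive and the
# config/titles files written by the dump script; all naming is a placeholder.
import os
import requests

IA_ACCESS = os.environ["IA_ACCESS_KEY"]   # archive.org "LOW" keys
IA_SECRET = os.environ["IA_SECRET_KEY"]

def wiki_metadata(api_url):
    """Ask the wiki's api.php for its name and license (siteinfo/rightsinfo)."""
    r = requests.get(api_url, params={
        "action": "query", "meta": "siteinfo",
        "siprop": "general|rightsinfo", "format": "json"})
    info = r.json()["query"]
    return info["general"]["sitename"], info["rightsinfo"].get("url", "")

def upload_dump(dump_dir, api_url):
    sitename, license_url = wiki_metadata(api_url)
    identifier = "wiki-" + os.path.basename(dump_dir)   # placeholder naming scheme
    headers = {
        "authorization": "LOW %s:%s" % (IA_ACCESS, IA_SECRET),
        "x-archive-auto-make-bucket": "1",
        "x-archive-meta-mediatype": "web",
        "x-archive-meta-title": "Wiki - " + sitename,
        "x-archive-meta-licenseurl": license_url,
        "x-archive-meta-originalurl": api_url,   # keep the api.php URL, as suggested at 21:15
    }
    for name in os.listdir(dump_dir):
        if name.endswith((".7z", ".txt")):
            path = os.path.join(dump_dir, name)
            with open(path, "rb") as f:
                resp = requests.put(
                    "https://s3.us.archive.org/%s/%s" % (identifier, name),
                    data=f, headers=headers)
            resp.raise_for_status()
            os.remove(path)   # free disk space only after a successful upload
```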
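The redirect issue mentioned at 21:33 presumably concerns wiki URLs that answer with a 301/302. There is no single "flag" in urllib, but Python 2's urllib2 already follows redirects through its default HTTPRedirectHandler, so one possible fix is simply to record the final URL after opening it. A minimal, hypothetical sketch:

```python
# Minimal sketch (Python 2 era): open the URL, let the default
# HTTPRedirectHandler follow any 301/302, and keep the final location.
import urllib2

def resolve_redirects(url, user_agent="wikiteam-bot"):
    request = urllib2.Request(url, headers={"User-Agent": user_agent})
    response = urllib2.urlopen(request)   # redirects are followed automatically
    final_url = response.geturl()         # URL after any redirects
    response.close()
    return final_url

# e.g. resolve_redirects("http://example.org/wiki/api.php") would return
# the post-redirect URL if the wiki has moved.
```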