[07:21] so I've put up a first cut of code to grab *.patch.com sites at https://github.com/ArchiveTeam/patch-grab (warning: doesn't upload yet)
[07:22] I'm not convinced that one-site-per-work-item is a sane granularity, but on the other hand, I can't think of anything better
[07:22] feel free to hack on it, etc.
[07:23] http://quilt.at.ninjawedding.org/patchy has all patch.com sites identified by antomatic in http://archiveteam.org/index.php?title=List_of_Patch.com_sites
[07:25] I'm testing the downloader on a few work items right now -- it's a very basic wget --mirror, more or less -- so we'll see how complete that is
[09:50] yipdw: possibly each site could be further divided into its common subcategories - e.g. /directory, /news, /jobs, /blogs, /boards, etc. - but going at it from the top level first seems like a good start to ensure nothing gets missed
[10:11] good luck! :)
[23:16] antomatic: that sounds like a good idea; we can combine the individual sections later using megawarc etc.
[23:27] a high-five to you two. awesome that I can come in here and scream fire and you folks are running with it
[23:32] ATZ0: btw, I did get HTTP 420'd
[23:32] with --random-wait --wait 1
[23:32] evidently more waiting is required
[23:33] I'm not sure *when*, though, because I went to bed while the grab was running and I forgot to turn retry off on the pipeline
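
For reference, a minimal sketch of what the "very basic wget --mirror" mentioned at [07:25] might look like, writing WARC output the way ArchiveTeam grabs usually do. The exact flags live in the patch-grab repo; the hostname below is a placeholder, not a real work item:

    # Hedged approximation of the downloader's wget call; see
    # https://github.com/ArchiveTeam/patch-grab for the real thing.
    SITE="example-town.patch.com"   # placeholder work item
    wget --mirror \
         --page-requisites \
         --no-parent \
         -e robots=off \
         --wait 1 --random-wait \
         --warc-file="$SITE" \
         "http://$SITE/"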
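
A toy illustration of the [09:50] suggestion: one way to cut a single-site work item into per-section items. The section list comes straight from the chat; the item format itself is an assumption, not the tracker's actual scheme:

    # Emit one work item per site section (hypothetical item format).
    SITE="example-town.patch.com"   # placeholder hostname
    for section in directory news jobs blogs boards; do
        echo "${SITE}/${section}"
    done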
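
On the [23:16] point about recombining: megawarc is ArchiveTeam's tool for bundling many small WARCs into one large one, so the per-section grabs could be packed back together per site. A sketch, assuming megawarc's pack mode and these hypothetical filenames; check the megawarc README for the actual invocation:

    # Pack several per-section WARCs into one megawarc (filenames hypothetical).
    ./megawarc --verbose pack example-town.patch.com \
        example-town.patch.com-news.warc.gz \
        example-town.patch.com-blogs.warc.gz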
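
And on the [23:32] HTTP 420 (rate limiting): since --wait 1 with --random-wait wasn't enough, backing off harder is the obvious next step. The values below are untested guesses, not numbers from the chat:

    # All of these values are guesses; tune against the actual 420 behavior.
    # --wait 5 --random-wait : longer, randomized base delay between requests
    # --waitretry 60         : linear backoff between retries of a failed fetch
    # --tries 3              : cap retries so a throttled item fails fast
    SITE="example-town.patch.com"   # placeholder work item
    wget --mirror \
         --wait 5 --random-wait \
         --waitretry 60 \
         --tries 3 \
         --warc-file="$SITE" \
         "http://$SITE/"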