[15:28] Nemo_bis: as the deadline for lists.apple.com gets closer, I'd like to re-archive just the 2014 messages. is there an easy way to grab the index URLs for these from the wget log and feed them to wget-warc to produce an "update warc"?
[15:28] right now it's still going... there's a lot of stuff here.
[15:31] can't you just use wget patterns so that it rejects anything not from 2014?
[15:31] I could but then it would re-crawl a bunch of stuff
[15:31] I've only archived pipermail archives, not that custom kind
[15:32] they're custom but pretty simple; it's all html based
[15:32] lists.apple.com/archives/LISTNAME/year/month/msg#####.html
[15:33] I probably could run a regex on the log looking for urls matching lists.apple.com/archives/*/2014/January/index.html
[15:34] hmm then again
[15:34] that would miss stuff if currently there aren't posts from 2014 and someone adds a post
[15:34] your method would probably be better
[23:05] DFJustin: That Chatnfiles FTP grab is going to be a month, I can feel it.
[23:06] The bandwidth is essentially smoke signals and one of the indians is drunk
[23:41] may be better off contacting the guy and working something out, there is a shoutout to you on the front page of chatnfiles.com
[23:49] so it's not enemy territory
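
A rough sketch of the regex-on-the-log idea from 15:33, assuming the wget log contains the full index URLs as plain text. The function names and the log file path are hypothetical, and the URL layout is taken from the pattern quoted at 15:32. As noted at 15:34, this only finds month indexes that had already been crawled, so a list with no 2014 posts at crawl time would be missed.

```python
import re
import sys


def index_url_pattern(year):
    """Build a regex for lists.apple.com/archives/LISTNAME/<year>/<month>/index.html."""
    return re.compile(
        r"https?://lists\.apple\.com/archives/[^/\s]+/"
        + re.escape(str(year))
        + r"/[^/\s]+/index\.html"
    )


def extract_index_urls(log_text, year=2014):
    """Return the unique index URLs for the given year, in first-seen order."""
    seen = set()
    urls = []
    for match in index_url_pattern(year).finditer(log_text):
        url = match.group(0)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


if __name__ == "__main__":
    # Usage (hypothetical file name): python extract_urls.py wget.log > urls-2014.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for found in extract_index_urls(log_file.read()):
            print(found)
```

The resulting list could then be handed to a WARC-capable wget build through its input-file option (`-i urls-2014.txt`) together with `--warc-file`; whether a recursive or a flat fetch is the better fit for the "update warc" is left open here.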