[10:40] <underscor> http://archive.org/about/dmca.php I had no idea this was a thing
[10:40] <underscor> \o/
[16:07] <godane> i got the blazetv doc called The Project
[16:59] <dashcloud> for wget warc do I need to include a header?
[17:06] <dashcloud> apparently you can't use two separate warc headers- everything has to be combined into a single --warc-header command
[17:58] <alard> dashcloud: You should be able to use multiple --warc-header options.
[18:00] <alard> You could do something like    wget --warc-header="operator: Archive Team" --warc-header="x-something-else: value"
[18:01] <alard> You can use any header you want, as long as it follows the name: value format. The headers will be stored in the warc-info record at the top of the warc file.
[18:03] <dashcloud> ah- that explains it
[18:03] <dashcloud> I didn't have a colon in the second header command
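(Note on the exchange above: since alard is not sure whether Wget validates the header strings, a quick pre-check before starting a long grab can catch a missing colon. The following is only a sketch. has_name_value_format is a hypothetical helper, not part of Wget, and it tests only for a non-empty name followed by a colon.)

```shell
# Hypothetical helper (not part of Wget): succeeds when the string looks like
# "name: value", i.e. at least one character followed by a colon.
has_name_value_format() {
  case "$1" in
    ?*:*) return 0 ;;
    *)    return 1 ;;
  esac
}

# Check each header string before passing it to --warc-header.
for header in "operator: Archive Team" "x-something-else: value"; do
  if has_name_value_format "$header"; then
    echo "ok:  $header"
  else
    echo "bad: $header (missing 'name:' prefix)"
  fi
done
```

(Each string that passes can then go into its own --warc-header option, as in alard's example.)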
[18:05] <dashcloud> so how much should I set recursion to in order to avoid infinite loops?
[18:40] <alard> I'm not sure if Wget checks the headers, it might just copy the strings.
[18:40] <alard> Recursion, well, that depends on what you're doing, I guess.
[18:42] <alard> It can be lower for very shallow sites, but must be high for sites with a deep structure. You could also try to ignore the looping urls with one of the ignore options.
[18:52] <dashcloud> thanks
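(Note on the recursion advice above: one way to experiment is to assemble the command first and inspect it before running anything. This is a sketch under assumptions. build_grab_command is a hypothetical helper, the depth of 5 and the WARC name "grab" are illustrative values, and --reject-regex is assumed to be the kind of "ignore option" alard means, which requires a Wget build that supports it.)

```shell
# Hypothetical helper: prints a wget command line instead of running it.
#   $1 = start URL
#   $2 = recursion depth (illustrative default: 5)
#   $3 = optional regex for looping URLs to skip (assumes --reject-regex support)
build_grab_command() {
  url="$1"
  depth="${2:-5}"
  loop_pattern="${3:-}"
  cmd="wget --recursive --level=$depth --warc-file=grab"
  if [ -n "$loop_pattern" ]; then
    cmd="$cmd --reject-regex='$loop_pattern'"
  fi
  printf '%s %s\n' "$cmd" "$url"
}

# Shallow site: a small depth is enough.
build_grab_command "http://www.touchatag.com/" 3

# Deeper site with a looping URL pattern ("calendar" is a made-up example).
build_grab_command "http://www.touchatag.com/" 10 "calendar"
```

(Printing the command first makes it easy to compare a shallow and a deep setting before committing to a long crawl.)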
[19:31] <dashcloud> hi folks, I did a basic grab of touchatag.com using these settings: http://pastebin.com/nzSnPfz7 and it would be great if someone could double-check it. I appear to have missed this page: http://www.touchatag.com/downloads and I'm not quite sure how
[20:00] <alard> dashcloud: I'm getting http://www.touchatag.com/downloads , so no idea what's wrong. (You might want to add --page-requisites, but that's something else.)
[21:01] <dashcloud> thanks!