[00:22] SketchCow: so i'm looking at artistserver.com
[00:24] ok i figured out how to grab all the music
[00:24] Do it.
[00:30] wget http://$website/music.cfm --mirror --warc-file=$website-music-index-$(date +%Y%m%d) --warc-cdx --accept-regex='(startrow)' -E -o wget.log
[00:30] artistserver.com/music.cfm has an index
[00:30] and the download links are on the index
[00:34] and code to get mp3s from web archive:
[00:34] zcat *.warc.gz | grep mp3 | sed 's|.*href="||g' | sed 's|".*||g' | grep media.artistserver.com
[02:29] SketchCow: https://archive.org/details/artistserver.com_artist_Sped_Heller
[02:30] i just uploaded the first band in the mp3 list
[04:37] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[04:37] yahoosucks
[09:09] yay, im back
[09:09] and not fired
[09:09] hooray for backups and such
[11:29] hey guise
[11:30] why is the archive script just giving me "No item received. Retrying after 30 seconds..."
[11:59] Which project are you running it on?
[11:59] twitch is currently on hold with no items left to hand out.
[12:00] swipnet has 48.5m items left to do
[12:30] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[12:30] yahoosucks
[12:30] how is that a secret
[12:30] its not anymore!
[15:35] is it better to upload an epub or a pdf if I have both for the same item?
[15:39] why not both?
[15:46] does epub not derive into other formats?
[16:01] epub doesn't derive afaik, but yeah upload both
[16:04] my understanding is because it is all text based, it is easier to convert epub. But pdf is more widely supported.
[16:04] so, is my item supposed to auto-derive from pdf to other formats, or do I need to do something special?
[16:06] I think it should derive automatically, but I'd say that you're still better off uploading the epub that you have, because while the deriver does a pretty admirable job, the one you have is almost certainly better.
[16:10] it'll auto-derive unless you add the file after creating the item, but you can still kick off a derive manually
[17:22] WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
[17:23] yahoosucks
[17:23] Thanks
[17:23] what was that all about.
[17:24] new user account for the archiveteam website
[17:24] oh
[17:24] I should make one of those
[17:25] underscor: fix your tubes
[21:23] Going back to the self storage thing, I have a little donation $ for building one of these :) https://www.backblaze.com/blog/why-now-is-the-time-for-backblaze-to-build-a-270-tb-storage-pod/
[22:30] So i just heard of you guys because of Swipnet being shut down
[22:31] so, has anyone here been in contact with them? Because their site is already down
[22:31] if you want to get involved see #swiped
[22:32] aaaaaaaaa, tnx
[22:34] aaaaaaaaa, as you are online. Does Archiveteam collaborate with other archive organizations? such as archive.org
[22:36] U23: Most of the stuff gets pushed up to archive.org
[22:36] (all?)
[22:36] thats great
[22:36] they have an informal relationship
[22:36] :)
[22:36] I think the trend had been all, but twitch may change that
[22:37] They have a lot of data from twitch
[22:38] oh come on, 35TB isn't that much ;-)
[22:39] its almost 100TB
[22:40] Although the people in charge would have a better idea, I think there were some duplicate grabs and some that might get deleted as someone else got it.
[22:41] :)
[22:43] All I know is that I'm glad I don't have to pay to store it.
[22:44] Kenshin is being a saint.
[22:44] ++
[22:45] i dont really get the twitch video archive, there should be a better candidate for archiving. I guess there is some sort of a prio list somewhere
[22:45] official or unofficial
[22:45] oh well, "archive everything" is the main thing anyway
[22:46] http://archiveteam.org/index.php?title=Twitch#What_we_are_saving
[22:47] aaaaaaaaa, i mean the bigger picture. "website x > website y"
[22:47] but still a nice link tho, <3 twitchplayspokemon hahaha
[23:53] Any chance I could get access to the call_for_help_canada collection? I got a dozen episodes from Jenn Cutter (who was on the show a bit) that aren't on IA yet.