#archiveteam 2014-08-20,Wed

Time Nickname Message
00:22 🔗 godane SketchCow: so i'm looking at artistserver.com
00:24 🔗 godane ok i figured out how to grab all the music
00:24 🔗 SketchCow Do it.
00:30 🔗 godane wget http://$website/music.cfm --mirror --warc-file=$website-music-index-$(date +%Y%m%d) --warc-cdx --accept-regex='(startrow)' -E -o wget.log
00:30 🔗 godane artistserver.com/music.cfm has an index
00:30 🔗 godane and the download links are on the index
00:34 🔗 godane and code to get mp3s from web archive:
00:34 🔗 godane zcat *.warc.gz | grep mp3 | sed 's|.*href="||g' | sed 's|".*||g' | grep media.artistserver.com
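The pipeline above can be wrapped in a reusable shell function. This is a minimal sketch, not godane's actual script: the function name `extract_mp3_links` is hypothetical, and it assumes the download links appear as double-quoted `href` attributes, as the `sed` expressions in the log imply. It uses `grep -o` so that several links on one line are each captured, and `sort -u` to drop duplicates.

```shell
# Hypothetical helper (not from the log): read HTML on stdin and print the
# unique MP3 URLs hosted on media.artistserver.com, one per line.
extract_mp3_links() {
    # Pull out every double-quoted href attribute that mentions ".mp3".
    grep -oE 'href="[^"]*\.mp3[^"]*"' |
        # Strip the leading href=" and the trailing quote.
        sed -e 's/^href="//' -e 's/"$//' |
        # Keep only links on the media host named in the log.
        grep -F 'media.artistserver.com' |
        # Remove duplicate URLs.
        sort -u
}
```

Assuming the WARC files from the crawl are in the current directory, `zcat *.warc.gz | extract_mp3_links > mp3-urls.txt` would write the list to a file (the file name is illustrative) that could then be passed to `wget -i`.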
02:29 🔗 godane SketchCow: https://archive.org/details/artistserver.com_artist_Sped_Heller
02:30 🔗 godane i just uploaded the first band in the mp3 list
04:37 🔗 Aztecguy WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
04:37 🔗 BiggieJon yahoosucks
09:09 🔗 midas yay, im back
09:09 🔗 midas and not fired
09:09 🔗 midas hooraay for backups and such
11:29 🔗 meg hey guise
11:30 🔗 meg why is the archive script just giving me "No item received. Retrying after 30 seconds..."
11:59 🔗 vantec Which project are you running it on?
11:59 🔗 vantec twitch is currently on hold with no items left to hand out.
12:00 🔗 vantec swipnet has 48.5m items left to do
12:30 🔗 JD__ WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
12:30 🔗 Cameron_D yahoosucks
12:30 🔗 JD__ how is that a secret
12:30 🔗 Cameron_D its not anymore!
15:35 🔗 dashcloud is it better to upload an epub or a pdf if I have both for the same item?
15:39 🔗 SadDM why not both?
15:46 🔗 dashcloud does epub not derive into other formats?
16:01 🔗 DFJustin epub doesn't derive afaik, but yeah upload both
16:04 🔗 aaaaaaaaa my understanding is because it is all text based, it is easier to convert epub. But pdf is more widely supported.
16:04 🔗 dashcloud so, is my item supposed to auto-derive from pdf to other formats, or do I need to do something special?
16:06 🔗 SadDM I think it should derive automatically, but I'd say that you're still better off uploading the epub that you have, because while the deriver does a pretty admirable job, the one you have is almost certainly better.
16:10 🔗 DFJustin it'll auto-derive unless you add the file after creating the item, but you can still kick off a derive manually
17:22 🔗 Revved WHAT FORSOOTH, PRITHEE TELL ME THE SECRET WORD
17:23 🔗 BiggieJon yahoosucks
17:23 🔗 Revved Thanks
17:23 🔗 avuserow what was that all about.
17:24 🔗 BiggieJon new user account for the archiveteam website
17:24 🔗 avuserow oh
17:24 🔗 avuserow I should make one of those
17:25 🔗 xmc underscor: fix your tubes
21:23 🔗 matthusb_ Going back to the self storage thing, I have a little donation $ for building one of these :) https://www.backblaze.com/blog/why-now-is-the-time-for-backblaze-to-build-a-270-tb-storage-pod/
22:30 🔗 U23 So i just heard of you guys because of Swipnet being shut down
22:31 🔗 U23 so, has anyone here been in contact with them? Because their site is already down
22:31 🔗 aaaaaaaaa if you want to get involved see #swiped
22:32 🔗 U23 aaaaaaaaa, tnx
22:34 🔗 U23 aaaaaaaaa, as you are online. Does Archiveteam collaborate with other archive organizations? such as archive.org
22:36 🔗 matthusb_ U23: Most of the stuff gets pushed up to archive.org
22:36 🔗 matthusb_ (all?)
22:36 🔗 U23 thats great
22:36 🔗 aaaaaaaaa they have an informal relationship
22:36 🔗 U23 :)
22:36 🔗 aaaaaaaaa I think the trend had been all, but twitch may change that
22:37 🔗 aaaaaaaaa They have a lot of data from twitch
22:38 🔗 matthusb_ oh come on, 35TB isn't that much ;-)
22:39 🔗 aaaaaaaaa its almost 100TB
22:40 🔗 aaaaaaaaa Although the people in charge would have a better idea, I think there were some duplicate grabs and some that might get deleted as someone else got it.
22:41 🔗 matthusb_ :)
22:43 🔗 aaaaaaaaa All I know is that I'm glad I don't have to pay to store it.
22:44 🔗 aaaaaaaaa Kenshin is being a saint.
22:44 🔗 matthusb_ ++
22:45 🔗 U23 i dont really get the twitch video archive, there should be a better candidate for archiving. I guess there is some sort of a prio list somewhere
22:45 🔗 U23 official or unofficial
22:45 🔗 U23 oh well, "archive everything" is the main thing anyway
22:46 🔗 aaaaaaaaa http://archiveteam.org/index.php?title=Twitch#What_we_are_saving
22:47 🔗 U23 aaaaaaaaa, i mean the bigger picture. "website x > website y"
22:47 🔗 U23 but still a nice link tho, <3 twitchplayspokemon hahaha
23:53 🔗 Famicoman Any chance I could get access to the call_for_help_canada collection? I got a dozen episodes from Jenn Cutter (who was on the show a bit) that aren't on IA yet.