[00:07] *** BlueMaxim has quit IRC (Leaving)
[00:07] *** BlueMaxim has joined #archiveteam-bs
[00:35] anybody running the seagate shingle drives?
[00:35] 8TB for $200 isn't bad
[00:35] and i have a pretty high write-once, read-many scenario
[00:52] *** dashcloud has joined #archiveteam-bs
[00:58] *** wabu has quit IRC (Read error: Operation timed out)
[00:58] *** espes__ has quit IRC (Read error: Operation timed out)
[00:59] *** chazchaz has quit IRC (Read error: Operation timed out)
[01:04] shingle drives - yuck
[01:04] might as well shuck those wd easystores from bestbuy and save yourself $40 or more
[01:17] *** wabu has joined #archiveteam-bs
[01:17] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[01:18] *** chazchaz has joined #archiveteam-bs
[01:19] *** espes__ has joined #archiveteam-bs
[01:23] are they all reds?
[01:28] i mean, knowing my use case, not sure it's bad
[01:28] a 20GB cache onboard, which i won't fill up immediately
[01:31] i'm making an audio collection of Charlie Rose episodes i have
[01:32] it's for the myspleen people but could be uploaded as a way for a much smaller data collection of Charlie Rose
[01:32] oh, a fellow ms user!
[01:40] SketchCow: is there any way to make all the podcasts collection files downloadable : https://archive.org/details/The_Jim_Rome_Show_Podcast-2005-01-03
[01:40] they're downloadable for me only
[01:41] so everyone else just sees broken items with no files
[01:54] so think a Charlie Rose mp3 collection would be like 250gb vs 2.5tb of video
[01:54] https://archive.org/details/apkarchive
[01:54] https://archive.org/details/ipaarchive building
[01:54] https://archive.org/details/cdromsoftware discovered CD-ROMs in uploads that have graphics attached for seeing what the item is
[01:54] also metadata is in the mp3 files
[01:54] https://archive.org/details/cdromimages discovered CD-ROMs in uploads (or in previous cd-rom collections) without graphics attached
[02:01] *** Valentine has joined #archiveteam-bs
[02:03] SketchCow: don't upload my video captures yet
[02:04] i'm having trouble with one file right now
[02:04] upload speed of FOS is very slow for some reason on my end
[02:12] SketchCow: so remember like 6mo ago when you wanted an ftp.prserv.net archive? i downloaded most of it but wonder what the preferred format is for splitting things into smaller archive chunks and/or providing a file listing
[02:13] zipping it all and uploading it all as one chunk is much like trying to fit an entire redwood tree into a woodchipper
[02:18] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[02:29] *** jdude104 has joined #archiveteam-bs
[02:39] *** Odd0002 has quit IRC (Ping timeout: 260 seconds)
[02:42] I've done that before
[02:42] But how big is it
[02:44] *** Odd0002 has joined #archiveteam-bs
[03:05] *** jdude104 has quit IRC (Read error: Operation timed out)
[03:06] *** jdude104 has joined #archiveteam-bs
[03:10] SketchCow: I forget off the top of my head but I think dozens of gigs (and I'm on DSL)
[03:11] More than 5 and less than 80
[03:12] I was thinking about zipping each group of folders separately to get a rough split that doesn't require people to download the whole thing if they just want one bit
[03:13] *** godane has quit IRC (Read error: Operation timed out)
[03:30] 80G is fine to upload all as one thing, but dividing it up would be OK too, IMO (but I'm nothing like an authority)
[03:31] If Hiccup shows up again, point them in the direction of the IA Census.
[03:40] *** godane has joined #archiveteam-bs
[03:41] try to keep items to 5 gigs, but you won't start running into issues until about 100g
[03:54] astrid: thanks, i think it's probably my side having issues more than anything. so i think i need to split it up but i wasn't sure if there was an archive/split/metadata format that was better than, say, a big .txt file with a recursive `ls` dump, etc
[04:12] *** dashcloud has quit IRC (Ping timeout: 252 seconds)
[04:12] *** dashcloud has joined #archiveteam-bs
[04:21] *** dashcloud has quit IRC (Remote host closed the connection)
[04:38] *** Valentine has joined #archiveteam-bs
[04:47] *** qw3rty17 has joined #archiveteam-bs
[04:51] *** qw3rty16 has quit IRC (Read error: Operation timed out)
[05:00] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[05:15] *** Valentine has joined #archiveteam-bs
[05:27] *** octothorp has quit IRC (Read error: Operation timed out)
[05:32] *** octothorp has joined #archiveteam-bs
[05:32] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[05:33] hey, I have a dumb question
[05:33] how do you generate the warc record ID?
[05:35] *** Valentine has joined #archiveteam-bs
[05:43] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[05:45] *** jdude104 has quit IRC (Read error: Operation timed out)
[05:52] *** Valentine has joined #archiveteam-bs
[06:00] I need it for the next update of my chrome extension, because liveweb can't capture past robots.txt
[06:06] check the spec
[06:26] *** ReimuHaku has quit IRC (Ping timeout: 250 seconds)
[06:28] *** ReimuHaku has joined #archiveteam-bs
[06:47] I did
[06:49] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[06:52] *** Valentine has joined #archiveteam-bs
[07:03] *** Pixi has quit IRC (Quit: Pixi)
[07:03] *** Pixi has joined #archiveteam-bs
[07:15] SketchCow: did you mean to add random images into https://archive.org/details/archiveteam_newssites items?
[08:42] *** tomaspark has quit IRC (Remote host closed the connection)
[09:02] atrocity: For a read-heavy environment, shingled drives seem pretty much perfect. I haven't used them yet, but I plan to do so soon (waiting for the price to come down a bit more on this side of the pond). Regarding WD Red vs. white-labeled: the two are essentially identical except for the PWDIS feature in some of the white drives (which is easily fixable with a bit of tape if your machine doesn't support it).
[09:03] Heh, I remember now that we discussed this before.
[09:16] *** tuluu has quit IRC (Ping timeout: 740 seconds)
[09:16] *** tuluu has joined #archiveteam-bs
[09:21] *** Mateon1 has quit IRC (Read error: Operation timed out)
[09:22] *** Mateon1 has joined #archiveteam-bs
[09:45] *** ranavalon has joined #archiveteam-bs
[09:52] *** AeonG__ has quit IRC (Ping timeout: 633 seconds)
[09:58] *** BlueMaxim has quit IRC (Leaving)
[10:00] *** REiN^ has quit IRC (Remote host closed the connection)
[10:16] *** icedice has joined #archiveteam-bs
[10:36] *** altlabel has quit IRC (Read error: Operation timed out)
[10:41] *** ZexaronS has quit IRC (Quit: Leaving)
[12:08] *** bwn has quit IRC (Read error: Operation timed out)
[12:09] *** odemg has quit IRC (Leaving)
[13:02] *** icedice has quit IRC (Ping timeout: 505 seconds)
[13:09] *** kevinr has quit IRC (Ping timeout: 250 seconds)
[13:10] *** kevinr has joined #archiveteam-bs
[13:12] *** odemg has joined #archiveteam-bs
[14:07] afternoon all
[14:07] * Jon pimps SketchCow's podcast on his blog
[14:25] archiveteam is lacking a third piece of the triforce
[14:26] irc for short term discussion, wiki for long-term information storage, where's the medium term
[14:26] we're lacking a forum
[14:26] channel offshoots like #msgbored with not much activity will never get anywhere without a place to discuss ideas where they won't disappear
[14:30] *** REiN^ has joined #archiveteam-bs
[14:39] HCross2: Yes
[14:40] Oh god not a forum
[14:41] Ah thanks. I looked this morning and was confused
[14:52] Trying to make the collections prettier and more defined.
[14:52] Add descriptions to the collections that are clearer. Could use help.
[14:56] Sanqui, heh, mailing list?
[14:56] or a usenet server.
[14:56] I'd love to have to use usenet again.
[14:56] that might do but I sorta lack the tools, my email handling is horrible
[14:57] I should get my thunderbird up and running again
[14:57] modern versions of MailMan apparently have a web front-end for archives which is nice enough to fool people into thinking they are actually using a forum
[14:57] although I haven't seen it deployed somewhere for real, which is telling.
[14:58] do people still use thunderbird
[14:58] yeah
[14:58] oki
[14:58] best of a bad bunch I think
[14:58] i do like gmail's mail grouping
[15:02] Oh god not a mailing list
[15:05] Hey, Archivebot has not shoved a new batch into the archive in what looks like almost a day.
[15:06] This is actually good news
[15:07] SketchCow, so I take that as tacit approval of a usenet group? :P
[15:07] oh god not a usenet group
[15:07] :>
[15:07] * Jon , despite having been a UNIX sysadmin in a prior life, wouldn't know where to begin with that
[15:07] I do
[15:07] But I will not do it
[15:07] So, look.
[15:08] There's a larger issue at scale. The easier we make it for everyone to hop into Archive Team on a "just stopping in" basis, the more we end up with
[15:08] MELLONCHOLY: Hi guyz!!!!
[15:08] MELLONCHOLY: Sure do hate things going away
[15:08] MELLONCHOLY: I want to help
[15:08] MELLONCHOLY: I read the docs
[15:08] MELLONCHOLY: I have submitted 409gb of warc grabs
[15:09] MELLONCHOLY: Ooops, done wrong
[15:09] MELLONCHOLY: Anyone here like pokemon
[15:09] MELLONCHOLY: I found a rare one
[15:09] lol
[15:09] MELLONCHOLY: Also, I have added a bunch of new things to the wiki
[15:09] MELLONCHOLY: Why r u all so mean
[15:10] To get back to Sanqui's point, tbh I think the wiki would be fine for forum-like stuff, i.e. Talk: pages
[15:10] I think if there's stuff being discussed, the Wiki is best now, now that jrwr has really spiffed things up
[15:11] talk pages are pretty bad, but I agree that there's no clearly good solution
[15:11] Talk pages are just fine
[15:11] You just need discipline.
[15:11] Speaking of which, I have over 40 windows open on this desktop doing archiving
[15:12] \o/
[15:12] point of order. For archiving a Blu Ray, which is legitimately archive-able (Creative Commons) *but* has AACS DRM; it makes sense to archive after decrypting right?
[15:12] the legal risk is in the process of decrypting; not receipt of post-decrypted stuff.
[15:13] I need to double check the decryption process is precise and there's not a fidelity issue too
[15:13] FOS has two partitions. One is at 6%, one is at 61%. The 61% is basically my stuff so I'm trying to nail it down. (It's "Just" 2tb of stuff)
[15:13] Ha ha legal risk
[15:13] well I mean DMCA or whatever
[15:13] I'll visit you in Blu-Ray Jail
[15:13] kind offer
[15:13] I'll bring a cake with a file
[15:14] this is probably why the existing BD rip on archive.org I'm aware of was darkened
[15:15] Well, if it was something recent and obvious, I'm sure a bot found it
[15:15] it's http://archive.org/details/NineInchNailsGhostsI-Ivblu-ray24bit96khz but I've already mailed info@ to discuss it, so not asking you to do anything here, just if you were curious
[15:15] CC-BY-SA-NC
[15:16] It was darked 3.1 years ago
[15:17] There, I undarked it
[15:17] yeah I'm slow to get back to this, I backburnered looking into it near the time
[15:17] oh wow thanks
[15:23] *** ranavalon has quit IRC (Read error: Connection reset by peer)
[15:25] *** ranavalon has joined #archiveteam-bs
[15:29] yeah I think thunderbird is just frozen trying to download my gmail inbox.
[15:31] You should see that in the status bar near the bottom. "Downloading header x of n" or something like that.
[15:33] I forgot to mention that of those 40 screens, something like 15 of them are running an analytic process against archive.org stacks of materials to find cases where I unwittingly broke things and to unbreak them
[15:33] Just a huge fucking cleanup crew
[15:35] is it satisfying to have the machine(s) doing work, though, right? getting value's worth from the silicon
[15:35] Well, I'm not sad or anything
[15:36] Oh, National Archives space uploads: https://www.kickstarter.com/projects/420606009/fight-for-space-space-program-and-nasa-documentary/posts/2089150?ref=backer_project_update => https://archive.org/details/@paul_hildebrandt
[15:36] I'm back to no hours in the day currently.
[15:37] *** Stilett0 has quit IRC (Ping timeout: 260 seconds)
[15:37] SketchCow, maybe ignore doing some things. No hours left might be bad health-wise.
[15:37] ha ha
[15:38] The gal keeps on top of me
[15:38] I went to bed at midnight like a prole last night
[15:38] :-)
[15:38] Like a rube, a huckleberry, a member of the masses without classes
[15:41] One window is fixing the covers of the zines section
[16:07] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[16:25] SketchCow: there is an addon I can add in that adds forum like threads to talk pages
[16:25] they work like reddit comments so threaded
[16:51] *** Stilett0 has joined #archiveteam-bs
[17:16] *** schbirid has joined #archiveteam-bs
[17:25] *** schbirid has quit IRC (Ping timeout: 252 seconds)
[17:27] *** Harzilein has quit IRC (Quit: ircII EPIC5-2.0.1 -- Are we there yet?)
[17:29] *** schbirid has joined #archiveteam-bs
[17:29] *** C4K3 has quit IRC (Read error: Connection reset by peer)
[17:30] *** C4K3 has joined #archiveteam-bs
[17:37] SketchCow: Can I create new collections with these new archive.org superpowers? The NIN remixes would deserve one, imo: https://archive.org/search.php?query=subject%3Aremix.nin.com%20mediatype%3Aaudio%20subject%3Aarchiveteam
[18:51] *** jdude104 has joined #archiveteam-bs
[19:02] Who is DKL3.
[19:24] *** medowar has quit IRC (Ping timeout: 252 seconds)
[19:24] *** purplebot has quit IRC (Ping timeout: 252 seconds)
[19:25] *** HCross2 has quit IRC (Ping timeout: 252 seconds)
[19:25] *** odemg has quit IRC (Ping timeout: 252 seconds)
[19:25] *** Rai-chan has quit IRC (Ping timeout: 252 seconds)
[19:27] *** odemg has joined #archiveteam-bs
[19:28] *** purplebot has joined #archiveteam-bs
[19:28] *** medowar has joined #archiveteam-bs
[19:29] *** HCross2 has joined #archiveteam-bs
[19:29] *** svchost03 sets mode: +o HCross2
[19:39] *** Rai-chan has joined #archiveteam-bs
[19:44] *** ndiddy has quit IRC ()
[19:44] *** ndiddy has joined #archiveteam-bs
[20:01] *** espes__ has quit IRC (Read error: Operation timed out)
[20:09] Stand down, tracked, thanks
[20:20] Somebody2: I'm wondering if you ever got a response back on the email you sent to Norsk Dataforening? (i apologize if i mistake you for another user here)
[20:36] *** espes__ has joined #archiveteam-bs
[20:53] *** icedice has joined #archiveteam-bs
[20:58] *** Kimmer has joined #archiveteam-bs
[21:02] *** qw3rty17 has quit IRC (Nettalk6 - www.ntalk.de)
[21:10] *** icedice has quit IRC (Quit: Leaving)
[21:42] *** Mateon1 has quit IRC (Read error: Operation timed out)
[21:42] *** Mateon1 has joined #archiveteam-bs
[22:02] *** Jusque has quit IRC (ZNC - http://znc.in)
[22:10] *** godane has quit IRC (Read error: Operation timed out)
[22:11] *** Jusque has joined #archiveteam-bs
[22:14] *** Jusque has quit IRC (Client Quit)
[22:15] *** Jusque has joined #archiveteam-bs
[22:38] ola_norsk: Nope, no response yet.
[22:43] Somebody2: aye, though there might have been some reorganization going on there in the last few weeks. Torp now seems to be listed as "general secretary".. http://www.dataforeningen.no/ansatte.134521.no.html ...It's typical fucking bureaucracy
[22:45] Eh, like I said -- my message was pretty much focused on "you don't need to reply to this" -- so not getting a reply just means they understood it. :-)
[22:45] *** godane has joined #archiveteam-bs
[22:45] a "thank you for your input" would've been courteous though
[22:53] *** godane has quit IRC (Quit: Leaving.)
[22:53] *** godane has joined #archiveteam-bs
[22:59] "Wanting people to listen, you can't just tap them on the shoulder anymore. You have to hit them with a sledgehammer, and then you'll notice you've got their strict attention."
[22:59] ..or an open letter, signed by academics
[23:00] *** ola_norsk has left
[23:09] i'm uploading tons of odd recordings i have
[23:09] this is so i can get rid of them once jason uploads them to archive.org
[23:10] one of the odd ones is a recording of WMUR News9 At 11 on 1997-02-23
[23:10] SketchCow: you're getting an old local news recording from my personal tape collection
[23:23] *** qw3rty15 has joined #archiveteam-bs