[00:34] DFJustin: freenode #datahoarder
[00:35] DFJustin: should i go in there?
[00:36] I'm lurking but it's not real active at the moment
[00:44] lol
[00:44] some of them have a nice storage system
[00:44] rest, not that much
[00:44] lots of storage is easy nowadays
[00:45] /dev/mapper/storage-storage 23T 17T 6.0T 74% /storage
[00:45] this system is 5 years old or something, 12x1T 12x1,5T drives
[00:46] when the new 6TB drives are available it would be 132TB in one box
[00:47] i saw some of their /r/datahoarder posts are about mirroring roms and stuff
[00:49] http://www.reddit.com/r/DataHoarder/search?q=nightly+data+dump&restrict_sr=on
[00:58] That guy is crazy.
[00:58] I don't bother saving stuff and just assume crazy people like him exist that will share stuff.
[00:59] crazy people on reddit, NO WAY ?!?!?!!?
[00:59] This is definitely a first!
[01:00] how much is in IA, about 10PB?
[01:00] 15 now
[01:00] What usage are they at?
[01:01] that's data I think, I dunno what the actual drive capacity is
[01:01] although usage might grow exponentially, luckily the storage capabilities also grow exponentially. So I think things work out.
[01:22] we have ~45TB of spinning metal
[01:22] ~16PB of unique data
[01:23] gonna assume that's 45PB
[01:23] whoops
[01:23] yeah
[01:24] the units all start to blur together
[01:27] I assume there is some redundancy built in there, so at least 1/3 of the raw space is lost
[08:14] anyone working on backing up the original nicktoons from vhs releases or from off-the-air recordings? IIRC a lot of that stuff got censored (ren and stimpy) and otherwise recut for longer commercial breaks even for the vhs releases
[08:14] from what i gather a lot of the original studio tapes for that stuff got lost/destroyed
[08:14] or what exists is the cut versions
[08:14] so fully intact copies may only exist from recorded airings
[08:47] reminds me of Star Wars.
[09:03] i got radioshackcatalogs.com
[09:03] at least everything but the hi-res images
[09:03] i still have to grab those and make a video dump too
[10:09] one of my favourite pictures, this is from a school camp: http://2.media.hyves-static.net/383885329/6/P7VZ/0/img383885329.jpeg
[10:09] (currently in the process of manually archiving these...)
[11:14] uploaded: https://archive.org/details/www.radioshackcatalogs.com-20131107
[12:28] i'm grabbing the radioshackcatalogs videos
[12:29] i'm doing a sort of a brute force grab
[12:29] cause some are video.flv and others are video.mp4
[12:53] Lord_Nigh: I won't be back until later tonight, but I do have a couple of tapes from Snick- their programming block on Friday nights. There's an episode of Ren & Stimpy on each.
[13:19] i'm starting to upload my g4tv images again
[13:19] just so it git-r-done
[14:05] also i may get the original encoding of ancient prophecies
[14:06] *also i may get the original encoding of ancient prophecies 4
[14:28] also i'm grabbing BEGIN Japanology series
[14:28] i may get a good chunk of it too
[14:37] also its in english too
[15:39] godane: WGBH is working on a Machine that Changed the World project
[15:40] Not going to be able to publish the whole series online, but they *are* going to be posting uncut interviews apparently
[15:41] oh
[15:41] cool
[15:42] at least that makes more sense now
[16:16] Anyone know how to download video from myspace?
[16:30] FlashGot?
[16:33] I'd say maybe DownloadHelper, but what exactly?
[16:41] Checked JDownloader - no
[16:41] balrog: video is https://myspace.com/goldtone/video/duelin-firemen-teaser/1010563
[16:42] oh, it's rtmp
[16:42] and they're not obfuscating the params
[16:43] Yeah, rtmpdump (via youtube-dl) produced a weirdly unplayable video though
[16:50] youtube-dl just crashes here.
[16:50] "ssl.SSLError: The read operation timed out"
[16:55] Weird
[16:55] getting timeouts with youtube-dl...
[16:56] reinstalled, looks better now
[16:56] mistym: the video plays fine in vlc
[16:57] balrog: Hm, I'll try it again
[17:11] godane; http://piratereverse.info/torrent/6768624/BBC_Horizon_Mega_Collection__(293_Episodes)#filelistContainer
[17:24] can someone fix this item: http://archive.org/details/images.g4tv.com-290001-to-300000_l-images-tar
[17:33] looks like the server is offline right now
[18:11] I wonder if BT will notice that I've uploaded upwards of 450GB in the past couple of weeks.
[18:12] Meh, fuck it. Seed, seed, seed.
[18:30] mistym: did it work?
[18:32] balrog: Seems like it's working with VLC, other software's not liking the video as much. So it wasn't the downloading end that was the problem after all.
[18:32] may need to run it through ffmpeg to convert into a more sane container since it looks like flv
[19:38] so, what about a browser plugin that uses youtube-dl to download every youtube video you see?
[19:41] I've seen some things on youtube I'd rather not have on my hard drive
[19:41] Don't access this site with scripts running. Somebody defaced a website in Bristol, so I am informed. http://www.bristolgigs.com/
[19:41] DFJustin has seen the lord of the dancy-dance
[19:43] I saved a copy of the defaced page.
[19:45] w0rp: I don't see anything on that page that looks like an exploit-by-JS
[19:45] w0rp: or is it just an annoying script
[19:50] I just didn't run whatever it was.
[19:51] I'm NoScript guy.
[19:51] *I'm a
[20:07] I've been getting paranoid about the Internet Archive… Archive Team has a lot of data there, but what if the Internet Archive goes down? is it mirrored anywhere? What about things that get deleted from the Internet Archive, e.g. old films with copyright issues, or spam - will they become available when they enter the public domain? Ehhh I'm a worrywart, but curious XD
[20:08] kyan: There's a full backup of archive.org at the Library of Alexandria
[20:08] Paranoia is one of the three virtues of an archivist.
[20:08] ^= This
[20:08] Kleptomania is the second
[20:08] Then RAGE.
[20:11] heh. thanks for the answer
[20:11] XD
[20:12] is there any good place to put things that are rare but under copyright, so they can be saved time-capsule style until they enter public domain?
[20:13] vacuum-sealed chamber inside a mountain
[20:13] right…
[20:13] given the state of copyright you're going to need that much time
[20:14] I'd also need a couple generous grants.
[20:14] kyan: simple: cause them to no longer be rare
[20:14] fwiw IA doesn't actually delete things behind the scenes
[20:14] DFJustin, that's a BIG relief
[20:15] that was one of my biggest worries about IA tbh
[20:16] I don't think the alexandria backup is a full one though, only wayback up to 2007 http://www.bibalex.org/isis/frontend/archive/archive_web.aspx
[20:16] Making things not rare is kind of hard. I mean, I can promote indie bands and such, but what about 5GB of three-year-old Iranian independent news podcasts?
[20:16] oh, I thought you were talking about physical objects
[20:16] We need rich benefactors to host copies of all of IA's content. I wonder how much it would cost to take a backup.
[20:16] hmm
[20:16] in that case, what's the issue with cp(1)
[20:16] "The archive at the Bibliotheca Alexandrina (BA) now includes 70 billion webpages covering the period 1996–2007, 2000 hours of Egyptian and US television broadcasts, 1,000 archival films and 25,000 digitized books acquired through the Open Content Alliance (OCA) consortium. It is capable of storing 3.7 petabytes of data on 1636 computers."
[20:17] no, I'm talking about digital media
[20:17] cool didn't know about the other stuff
[20:18] still not the whole archive though.
[20:18] indeed
[20:18] Nuke hits san francisco and we're in trouble
[20:18] :(
[20:18] I mean, hopefully never be an issue
[20:18] earthquake is probably more likely
[20:18] 1906 earthquake take II is probably efb
[20:18] yah
[20:19] on the other hand, it'd also take out a lot of Web 2.0 companies
[20:19] so it's not all bad on balance
[20:20] They should all be mirrored and archived!
[20:20] IMO all the world's information should go into a single archive, which should be archived many times over worldwide
[20:21] well a single mirrored archive is vulnerable to other failure modes
[20:22] hmm… maybe make snapshots of it yearly and bury them in that vacuum sealed mountain?
[20:22] you want a variety of archives using different physical storage methods and software setups
[20:22] right
[20:24] best option would be physical (e.g. engraved metal or something) probably… digital storage is quite fragie
[20:24] *fragile
[20:25] I guess in the end it comes down to money, money, money, money
[20:25] http://www.norsam.com/rosetta.html
[20:25] writings on stone tablets sure lasted for a long time...
[20:26] when launching into commercial space becomes more cost-effective, send up rad-hardened drives into orbit
[20:26] it'd be totally impractical but it might make for a good realization of science fiction stories
[20:27] i.e. post-apocalypse satellite crash
[20:27] reminds me of this… http://rosettaproject.org/disk/concept/
[20:28] some guy spends the rest of his life trying to figure out how to plug in a SATA connector
[20:28] I really want one of those but 10 big ones is a little steep
[20:31] stone tablet writing on how to build lenses/magnifying glass -> next you can have microfiche for building up more technology -> ....?.... -> SATA connector
[20:32] that would be interesting… postapocalyptic technological bootstrapping
[20:42] hmm, what about this disaster… https://archive.org/post/778/exclusions-from-the-wayback-machine The content still hasn't gotten restored (https://web.archive.org/web/*/http://www.demon.co.uk/castle/audit/); one poster wrote "…Internet Archive has apparently followed only half of the DMCA procedure - taking down material without allowing the content owner…an appeal…". Seems like kind of a significant unresolved issue.
[22:05] well this is desperate
[22:05] https://twitter.com/yungcutup
[22:09] Twitter spam always seems like this particular style of sad.
[22:10] sad, or genius?
[22:19] sad
[22:19] I found that because that account messaged my chatbot
[22:20] https://twitter.com/Stirspeare/status/398661999913369601 for context
[22:24] what does your chatbot do?
[22:25] it sits in a bunch of IRC channels, feeds that to a MegaHAL instance, and then outputs it to Twitter
[22:25] excellent
[22:25] it was the most legitimate application of Twitter I could think of
[22:27] seems about right
[22:58] outputs what?
[23:08] https://www.youtube.com/watch?v=1-O_lFm-mBg It's Mandatory Cute Puppy Time.