[04:11] https://www.youtube.com/user/JPCMHD Japanese TV Commercials
[04:14] (is there a list of channels that have been backed up?)
[04:42] i got all of starcade episodes on to archive.org
[04:59] I'm out of disk space to back up any more channels lol
[06:05] well, I've got that one, as long as you trust the 4TB ext4 partition to survive a few years
[10:26] what sort of mail server does give such a response? http://p.defau.lt/?fGiPeGk9JlJJrqvDmVmxjg
[10:36] Nemo_bis: looks like some custom message from a spam filter or whatever
[10:36] or reputation filter thing
[14:01] Next up to upload as MegaWARCs... Splinder!
[14:41] my wifi is hating me
[14:41] hmmm
[14:41] SOLAR FLARE WARNING BY SMILEY.
[14:42] i can't upload 128 iso of linux format
[14:42] THIS WILL BE PROCEEDED BY OFFICIAL WARNINGS BY NASA IN THE NEXT FEW DAYS.
[14:43] * SmileyG checks spaceweather.com
[14:43] Oh look, more warnings ¬_¬ Wifi is so damn sensitive and no one's realised yet it's the perfect tool for spotting solar flare interference.
[14:44] is archive.org s3 having problems?
[14:45] I'm uploading to it as I go.
[14:46] solar flares! (Ok, I'll stop now).
[14:52] So, I'm uploading Splinder into the wayback.
[14:53] That leaves a couple more, according to the spreadsheet, that have to go through the wringer.
[14:53] Until we're 100% sure this works, we're not touching MobileMe.
[14:54] But after I finish the last couple on the spreadsheet, specifically Picplz and Fortunecity, I'd appreciate some effort going through the archiveteam sets and finding stragglers.
[14:54] Now, bear in mind - if something already has a CDX file, it's already in the wayback. I.e. we have a TON of stuff Godane added, that's all gone in, even though it's not on the spreadsheet.
[14:55] But I'm looking specifically for uploads done where they're .WARC files inside .tar or .zip files.
[14:55] How would one go about 'going through the sets and finding stragglers'?
[14:55] http://archive.org/details/archiveteam
[14:55] Go through "All items"
[14:56] In multiple cases, we have two sets of the same files.
[14:57] That is, we have 26 UMICH items, but we ALSO have 26 WARC items. That means it's done - I'm just being careful and doubling data until we're secure it's working as advertised.
[14:57] Plus, we have orphan items that could probably stand to go into collections where possible.
[14:57] This is all a function of #whatnow but I'd like it done, so we're cleaned up for 2013.
[14:59] This is JUST for stuff in which WARCs play a part. Obviously we have tons of items with no WARCs, like the Mypodcast rescue
[15:00] Also: Damn, http://archive.org/details/archiveteam is one fucking impressive wall of items, downloads, and sites
[15:01] Good job
[15:04] OK, going after fortunecity
[15:07] Shit is wack.
[15:34] http://i.imgur.com/eEGC2.jpg
[15:48] SketchCow: not warc but needs to go in the archiveteam collection somewhere I would think https://archive.org/details/archiveteam-thingiverse-2012-09
[15:49] Done.
[15:49] You will all be shocked to know that transferring millions of files using the megawarc converter slows down other disk operations on that drive.
[16:36] http://www.metafilter.com/121172/Goodbye-Cruel-World
[16:36] I wonder how hard it would be to extract and archive teletext
[16:41] Not
[16:46] seems to be a bunch of it on youtube, might be worth doing a keyword grab of some sort
[20:03] Does Warrior delete downloaded content after uploading? I noticed my Warrior image is growing.
[20:06] mistym: It should. Have you checked with df? (The warrior image will grow to 60GB, since the empty space on the disk is not claimed back.)
[20:07] alard: I'll take a look when I get home.
[20:07] If you don't want to give it 60GB you can replace the disk with a smaller disk image; it will be formatted when the warrior boots.
[20:13] or a larger one ;)
[22:11] Ymgve: ceefax.tv did it
[22:23] tef: they are scraping the tv signal directly?
[22:24] looks like
[22:26] apparently http://www.theregister.co.uk/2005/04/01/ceefax_google/
[22:27] but the content is out of date
[22:28] still, nice
[22:31] so reading the scrollback, I did a brief check of the items, and I came across Coming Soon, which has one item as WARCS, and there's a second item with a WARC file inside a zipfile
[23:33] i'm on the net for now
[23:33] my wireless internet is not working
[23:34] i had to go wired