[00:00] *** okeuday1 has quit IRC (Read error: Connection reset by peer)
[00:15] *** okeuday has joined #archiveteam
[00:28] *** antomatic has quit IRC (Ping timeout: 265 seconds)
[00:31] *** antomatic has joined #archiveteam
[00:32] *** MMovie has joined #archiveteam
[00:32] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[00:34] *** MMovie1 has quit IRC (Read error: Operation timed out)
[01:17] *** Ymgve has quit IRC ()
[01:35] *** zenguy_pc has joined #archiveteam
[01:51] *** primus104 has quit IRC (Leaving.)
[01:55] *** zenguy_pc has quit IRC (Read error: Operation timed out)
[02:00] *** Morbus has joined #archiveteam
[02:03] *** zenguy_pc has joined #archiveteam
[02:42] *** ete_ has quit IRC (Remote host closed the connection)
[02:55] Back
[02:56] Down to just two directories of verizon warc.gzs
[02:56] After that, they need to be bound into megawarcs but all told, this will hopefully be behind us and maybe by the end of the weekend, FOS will function like it used to.
[02:56] Maybe.
[02:56] It'll certainly have more inodes
[03:05] inodes are like herpes, you can never have too many- wait, no, that's not how the saying goes
[03:07] inodes are like sayings from joepie91 - you can have too many and they might all indeed suck eggs
[03:07] :(
[03:12] I dunno what it is about sucking eggs I just love using the phrase
[03:17] *** Jonimus has quit IRC (Quit: WeeChat 1.0.1)
[03:21] *** Jonimus has joined #archiveteam
[03:30] *** mistym has joined #archiveteam
[03:34] you guys will be starting to get radionz mp3 next week
[03:34] cause i'm crazy
[03:36] *** philpem has quit IRC (Ping timeout: 272 seconds)
[04:06] Down to 2 nightmare directories.
[04:07] SketchCow: i'm listening to your Boss
[04:11] He is very very hoarse
[04:21] i noticed that
[04:22] after this he can rest his voice i hope
[04:36] *** godane has quit IRC (Read error: Operation timed out)
[04:58] *** aaaaaaaaa has quit IRC (Leaving)
[05:09] *** LordNigh2 has joined #archiveteam
[05:11] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[05:12] *** dashcloud has joined #archiveteam
[05:19] *** Lord_Nigh has quit IRC (Ping timeout: 600 seconds)
[05:19] *** LordNigh2 is now known as Lord_Nigh
[05:51] *** godane has joined #archiveteam
[06:12] *** signius has quit IRC (Read error: Operation timed out)
[06:25] *** signius has joined #archiveteam
[07:40] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[07:42] *** dashcloud has joined #archiveteam
[07:53] *** primus104 has joined #archiveteam
[08:02] *** mistym has quit IRC (Remote host closed the connection)
[08:40] *** T31M has quit IRC (Quit: Leaving)
[08:41] *** godane has quit IRC (Ping timeout: 633 seconds)
[09:00] *** signius has quit IRC (Read error: Operation timed out)
[09:59] *** Kenshin has quit IRC (Ping timeout: 258 seconds)
[09:59] *** Kenshin has joined #archiveteam
[10:01] *** hive-mind has quit IRC (Ping timeout: 272 seconds)
[10:02] *** hive-mind has joined #archiveteam
[10:16] *** Kenshin has quit IRC (Ping timeout: 246 seconds)
[10:24] *** primus104 has quit IRC (Leaving.)
[10:25] *** Kenshin has joined #archiveteam
[10:32] SketchCow: cf, computerfreek, has discovered over 1 million viddy items
[10:33] I'm going to download some of them and get an estimate of the total size
[10:34] I'm already only downloading the high quality versions of all videos and not the medium and low quality versions
[10:40] *** BlueMaxim has quit IRC (Quit: Leaving)
[10:56] SketchCow: with the current 1 million items we have I'm estimating this to be around 2.5 to 3 TB (that's with only the high quality version)
[10:56] I will also do a grab with all three versions (high, medium, low) and see what the size will be then
[11:21] *** SN4T14 has quit IRC (Ping timeout: 335 seconds)
[11:21] *** Ymgve has joined #archiveteam
[11:22] *** SN4T14 has joined #archiveteam
[11:22] SketchCow: looks like items are around 60-70% bigger when we download all three video sizes
[11:24] 50-60%, actually
[11:25] That makes it around 4-5 TB in size total for the 1 million items we know exist. However, more items might be discovered in the future, which can up the total size by a few TB
[11:26] I'd say let's do all three versions, but I don't know how the situation on space is at IA, so what do you think we should do?
[11:29] *** schbirid has joined #archiveteam
[11:45] *** ohhdemgir has quit IRC (Quit: Leaving)
[11:51] *** signius has joined #archiveteam
[12:38] *** midas1 is now known as Midas
[12:38] *** Midas is now known as midas
[13:16] *** ex-parro1 has quit IRC (Read error: Operation timed out)
[13:19] *** ex-parrot has quit IRC (Read error: Operation timed out)
[13:23] *** ex-parrot has joined #archiveteam
[13:23] *** ex-parro1 has joined #archiveteam
[14:08] *** godane has joined #archiveteam
[14:14] *** philpem has joined #archiveteam
[14:47] *** ohhdemgir has joined #archiveteam
[14:52] *** dashcloud has quit IRC (Quit: No Ping reply in 180 seconds.)
[14:53] *** dashcloud has joined #archiveteam
[15:02] *** mistym has joined #archiveteam
[15:06] Just the largest size.
[15:06] Why grab three versions?
[15:10] *** mistym has quit IRC (Remote host closed the connection)
[15:10] *** Emcy_ has joined #archiveteam
[15:11] because the medium size is actually used in the webpages, not the high quality size
[15:12] And I think the more original data we grab from site, the better
[15:12] *** primus104 has joined #archiveteam
[15:14] I think the purpose of this page is middling at best.
[15:15] And so I'd rather we ran a backup of the data than have intermediates.
[15:15] Someone can generate an intermediate down the line if needed.
[15:16] *** Emcy has quit IRC (Ping timeout: 480 seconds)
[15:16] Down to one nightmare directory left on FOS
[15:27] *** mistym has joined #archiveteam
[15:32] *** Emcy has joined #archiveteam
[15:35] *** Emcy_ has quit IRC (Ping timeout: 480 seconds)
[15:36] *** aaaaaaaaa has joined #archiveteam
[15:58] *** Start has joined #archiveteam
[16:01] *** PepsiMax_ has joined #archiveteam
[16:01] *** PepsiMax_ has quit IRC (Client Quit)
[16:26] *** mistym has quit IRC (Remote host closed the connection)
[17:02] *** dx has quit IRC (Remote host closed the connection)
[17:03] *** dx has joined #archiveteam
[17:15] *** mistym has joined #archiveteam
[17:18] *** mistym has quit IRC (Remote host closed the connection)
[17:19] *** mistym has joined #archiveteam
[17:36] *** dxdx has joined #archiveteam
[17:36] *** dx has quit IRC (Read error: Connection reset by peer)
[17:39] *** philpem has quit IRC (Ping timeout: 272 seconds)
[17:54] *** signius has quit IRC (Remote host closed the connection)
[17:55] *** phuzion has quit IRC (Quit: No Ping reply in 180 seconds.)
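A quick check of the viddy size estimate from the 10:56 to 11:25 messages, as a minimal sketch. It takes the stated 2.5 to 3 TB for the high quality versions only and the stated 50-60% increase when all three sizes are grabbed; the function name is hypothetical and not from the channel.

```python
def combined_size_range(base_tb, growth):
    """Scale a (low, high) size range in TB by a (low, high) fractional increase."""
    low = base_tb[0] * (1 + growth[0])
    high = base_tb[1] * (1 + growth[1])
    return low, high


# Figures quoted in the log: 2.5-3 TB high quality only, 50-60% bigger with all three sizes.
low, high = combined_size_range((2.5, 3.0), (0.50, 0.60))
print(f"{low:.2f} to {high:.2f} TB")  # prints "3.75 to 4.80 TB"
```

That range is consistent with the "around 4-5 TB" quoted at 11:25, before any newly discovered items are counted.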
[17:57] *** phuzion has joined #archiveteam
[18:04] *** signius has joined #archiveteam
[18:17] *** phuzion has quit IRC (Read error: Operation timed out)
[18:17] *** phuzion has joined #archiveteam
[18:21] *** mistym has quit IRC (Remote host closed the connection)
[18:43] *** godane has quit IRC (Ping timeout: 272 seconds)
[18:52] *** primus104 has quit IRC (Leaving.)
[18:55] *** primus104 has joined #archiveteam
[18:55] *** primus104 has quit IRC (Client Quit)
[18:59] *** raylee has quit IRC (Quit: ZNC - http://znc.in)
[19:02] *** mistym has joined #archiveteam
[19:10] *** godane has joined #archiveteam
[19:25] TELE2 project officially done.
[19:25] congrats!
[19:26] Fos happy again?
[19:26] no.
[19:26] verizon!
[19:26] No, TELE2 was one of three miserable destructions.
[19:26] Verizon and Swipnet are the other two.
[19:26] The rest, like Halo and Archivebot, generate reasonable amounts of files, and the system isn't in pain.
[19:27] I have one directory here. It's going from an rsync'd user upload of verizon accounts to the staging directory that will make the megawarc.
[19:27] It has been copying files, just copying them, from one directory to another, for 2 solid days, and counting.
[19:30] small files break stuff
[19:35] They really do, in this case. FOS is overloaded. I've begun moving some operations to SIS.
[19:36] SWIPNET has another 17 packs of material to go through.
[19:37] VERIZON has four, but those four are unclear on their sizes - I basically had to guess
[19:39] fr€'~B ~K7~mh,~r2v72
[19:39] Exactly. Like that.
[19:41] SRY, CAT ATTACL
[19:42] can someone start an archive of this every 5 to 6 days: http://downloads.bbc.co.uk/podcasts/worldservice/newshour/rss.xml
[19:42] and busted the damn kb ¬_¬; right. Stopping now.
[19:42] they delete them after 7 days
[19:42] godane: yeah they do all their podcasts, I looked at it before
[19:42] so many tho
[19:45] http://i.imgur.com/43fhrpX.gifv
[19:49] o~_O
[19:55] *** mistym has quit IRC (Remote host closed the connection)
[20:00] *** primus104 has joined #archiveteam
[20:01] so some good news on radionz
[20:02] looks like they start their radio talk year about a month after christmas
[20:02] SketchCow: you must be happy that we didn't send Isohunt stuff at FOS :)
[20:03] in its original form
[20:03] so this means they did just pick everything after jan 21 2008 to delete everything
[20:03] so this means the 2008 year may be full
[20:13] *** garyrh has quit IRC (Remote host closed the connection)
[20:19] godane: I wrote a Python 3 script to download MP3s from the RSS feed. https://bpaste.net/show/bb3b816a3c73
[20:21] Shall I add it to a cronjob to run every once in a while, for how long, and where should the files get uploaded to in the end?
[20:21] Sorry, not "for how long," how often should it check?
[20:24] *** nblr_ has joined #archiveteam
[20:25] *** nblr has quit IRC (Ping timeout: 365 seconds)
[20:29] *** primus105 has joined #archiveteam
[20:30] was the data from Sony ever released?
[20:34] *** primus104 has quit IRC (Read error: Operation timed out)
[20:34] *** garyrh has joined #archiveteam
[20:46] *** slash` has quit IRC (Ping timeout: 480 seconds)
[20:54] *** ex-parrot has quit IRC (Leaving.)
[20:54] *** ex-parro1 has quit IRC (Leaving.)
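A minimal sketch of the kind of RSS-to-MP3 grab mentioned at 20:19. This is not the script behind the bpaste link, whose contents are not shown in the log. It assumes the feed follows the usual RSS 2.0 layout with an `enclosure` element per `item`; the function names and the `dest_dir` parameter are hypothetical.

```python
import os
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def extract_enclosure_urls(rss_xml):
    """Return the enclosure URLs of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    urls = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is not None and enclosure.get("url"):
            urls.append(enclosure.get("url"))
    return urls


def download_new(feed_url, dest_dir):
    """Fetch the feed and save any enclosure not already present in dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    with urllib.request.urlopen(feed_url) as resp:
        rss_xml = resp.read()
    saved = []
    for url in extract_enclosure_urls(rss_xml):
        # Assumption: the last path component is a usable, unique file name.
        name = os.path.basename(urlparse(url).path)
        target = os.path.join(dest_dir, name)
        if name and not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
            saved.append(target)
    return saved
```

Because files already on disk are skipped, calling `download_new` from a scheduled job more often than the 7-day deletion window mentioned at 19:42 should pick up each episode once.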
[20:54] *** ex-parrot has joined #archiveteam
[20:55] *** ex-parro1 has joined #archiveteam
[20:57] *** Lord_Nigh has quit IRC (Read error: Operation timed out)
[20:59] *** Lord_Nigh has joined #archiveteam
[21:00] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[21:04] *** Diesel_ has quit IRC (Remote host closed the connection)
[21:09] *** mistym has joined #archiveteam
[21:09] *** slash` has joined #archiveteam
[21:10] *** godane has quit IRC (Ping timeout: 258 seconds)
[21:15] *** mistym has quit IRC (Remote host closed the connection)
[21:17] *** mistym has joined #archiveteam
[21:19] *** ruukasu has joined #archiveteam
[21:21] *** BlueMaxim has joined #archiveteam
[21:26] *** BiggieJo1 has quit IRC (Read error: Connection reset by peer)
[21:53] *** useretail has joined #archiveteam
[22:17] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[22:17] *** ruukasu has joined #archiveteam
[22:32] *** ruukasu has quit IRC (Quit: WeeChat 1.0.1)
[22:37] *** ruukasu has joined #archiveteam
[22:54] *** godane has joined #archiveteam
[23:04] *** mistym has quit IRC (Remote host closed the connection)
[23:05] *** philpem has joined #archiveteam
[23:16] *** mistym has joined #archiveteam
[23:37] *** ex-parro2 has joined #archiveteam
[23:48] *** ex-parro2 has quit IRC (Remote host closed the connection)