[00:00] *** Jon has quit IRC (Quit: ZNC - http://znc.in)
[01:28] *** wp494 has quit IRC (Read error: Operation timed out)
[01:35] *** wp494 has joined #internetarchive.bak
[01:46] *** wp494_ has joined #internetarchive.bak
[01:49] *** wp494 has quit IRC (Read error: Operation timed out)
[02:25] *** wp494_ is now known as wp494
[04:24] *** wp494 has quit IRC (LOUD UNNECESSARY QUIT MESSAGES)
[04:25] *** wp494 has joined #internetarchive.bak
[07:18] *** Mateon1 has quit IRC (Ping timeout: 260 seconds)
[07:18] *** Mateon1 has joined #internetarchive.bak
[10:36] *** Mateon1 has quit IRC (Remote host closed the connection)
[10:36] *** Mateon1 has joined #internetarchive.bak
[17:41] *** beardicus has quit IRC (bye)
[17:45] *** beardicus has joined #internetarchive.bak
[21:14] *** sep332 has quit IRC (Read error: Operation timed out)
[21:22] *** ez has joined #internetarchive.bak
[21:22] Somebody2: well, not sure if this place is more suitable for what-ifs than -bs, but anyway
[21:23] ez: Eh, it's still more on the topic. :-)
[21:23] And it enables people who don't care about it to ignore it more easily.
[21:23] if you have a well-defined space of items, say 0-100, and everyone picks a random item off it, you get near 99% coverage after 5 or so redundant replicas
[21:23] this works without any coordination, provided all participants pick uniformly at random (they have no reason not to, they're altruistic after all)
[21:23] ez: I see.
[21:24] But I'm not convinced that the lack of uptake of IA.BAK is due to the (very low) coordination requirement.
[21:24] Somebody2: for a starting point, i'd probably consider the timestamped snapshot
[21:24] it's not perfect, but might work reasonably well
[21:25] then after some time, do a second snapshot delta from that, and do the same dance
[21:25] Also, if you have no coordination, you have no way to *restore* the backup.
[21:25] At least, not without out-of-band efforts.
[21:25] indeed you don't, you might have only a vague estimate of how spotty the global backup is
[21:26] in fact, i'd just straight burn the randomly chosen items to drives which are not worth the electricity to keep online
[21:26] fill a 250GB drive, pile it in the closet
[21:27] So given that, I'm not sure how much it matters whether the random space is uniform or not.
[21:27] Somebody2: the restoration problem is not related to a lack of coordination, but to having a high replica count
[21:27] you need a high number of replicas anyway
[21:27] Wait, how does having a high replica count affect restoration?
[21:28] well, you obviously need to encode with RS 255,255-85
[21:28] What is RS 255,255-85?
[21:28] you need any 85 things out of 255 to restore the original 85 things
[21:28] in layman's terms, a 1-of-3 ECC code
[21:28] OK...
[21:28] the numbers are high because you have insane statistical variance
[21:29] But if you don't have a way to contact *any* of the replicas, you can't restore in any case.
[21:29] Somebody2: so say iabak goes bust and now everyone needs to restore
[21:29] they open their closets
[21:29] And if you *do* have a way to contact the replicas, that's coordination.
[21:29] and as long as at least 1/3 of the total that was backed up is there
[21:29] *any* 1/3
[21:29] you get the whole archive
[21:30] Ah, so you contact the replicas AFTER THE FACT, when restoration is needed.
[21:30] Somebody2: they have useless shards of RS code
[21:30] they *need* to coordinate when restoring
[21:30] but not when backing up
[21:30] I see. OK, yeah, I can see that being a possible improvement.
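To put a number on the coverage claim at [21:23] above, here is a minimal simulation sketch in Python. The 100-item space and the replica counts follow the chat's example; the trial count and the assumption of independent, uniform picks are mine, not from the discussion.

```python
# Minimal sketch (assumption: items are picked independently and uniformly at random).
# With N items and an average replica count k (i.e. k*N total random picks),
# the expected fraction of items covered at least once is 1 - (1 - 1/N)**(k*N) ~= 1 - e**-k.
import random

def coverage(num_items=100, avg_replicas=5, trials=1000):
    """Estimate the fraction of items that end up with at least one copy."""
    total = 0.0
    for _ in range(trials):
        picked = {random.randrange(num_items) for _ in range(avg_replicas * num_items)}
        total += len(picked) / num_items
    return total / trials

if __name__ == "__main__":
    for k in (1, 2, 3, 5):
        print(f"avg replicas = {k}: coverage ~= {coverage(avg_replicas=k):.3f}")
    # avg replicas = 5 gives roughly 0.993, i.e. the "near 99% coverage" mentioned
    # above; the remaining ~0.7% of items get zero copies purely by bad luck.
```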
[21:31] It does prevent being able to get any idea of the progress of the backup (without calling for restoration), though.
[21:31] Somebody2: it is indeed a what-if, as i'm making different motivation assumptions than you have
[21:31] you have the assumption that people are interested in a seti-at-home online server/client architecture, which is fine
[21:32] No, one of the explicit goals of the effort was to allow people to store the HDs offline, and plug them in briefly once a month or so.
[21:32] hmm, that's neat
[21:32] though the spin-ups could still be seen as a lot of bother
[21:32] Yep!
[21:33] i mean there's a lot of emphasis on tracking replicas
[21:33] which makes no sense
[21:33] But if you don't test backup media regularly, you should assume it's unrecoverable.
[21:33] ez: Oh? Why does tracking replicas make no sense?
[21:34] when massive spread is done with erasure coding and you hit a certain average replica count
[21:34] whether your drive failed or not doesn't matter as much
[21:34] your drive failure slightly lowered the chance of recovery across a wide board
[21:34] But you'd still need to report back the average replica count in order to get progress reports, though.
[21:35] Somebody2: of course this makes the wild assumption that there is sufficient capacity, which there isn't
[21:35] for stochastic replication to work reasonably, you'd need to restrict the subset
[21:35] but you need to do that anyway to achieve uniform randomness
[21:35] (which is currently done with shards, in fact)
[21:36] Somebody2: yea, it would need some fancy statistics
[21:36] like knowing the split of "people keeping data online vs keeping it in the closet"
[21:36] not sure how to arrive at that number
[21:36] but once you have it, you can infer total numbers
[21:39] ez: I mean, don't you just need a count of "replicas (of any single shard)"?
[21:39] Somebody2: in practice, the stochastic domains would each live in their own shard, yea
[21:40] so a single volunteer would pick a shard, and start picking random items off it
[21:40] until through some vague quorum protocol it is agreed the shard is sufficient
[21:40] then it moves on to another shard
[21:41] quorum can be fairly simple proofs of possession. however, it doesn't solve the closet problem in a straightforward manner
[21:42] Somebody2: total progress would be metered in terms of shards with an observed sufficient online count (provided we know the closet number)
[21:42] ez: Well, what we have now already does that...
[21:42] Somebody2: yep. the closet number can't be figured out easily without a central authority
[21:44] The closet number can't be found out at all.
[21:44] under the assumption that most participants are honest about it, it can
[21:45] Somebody2: it is my understanding the current sharding doesn't use erasure coding
[21:45] ez: Not without asking people to plug in the HDs in their closet once a month.
[21:45] which worsens the situation a lot regarding the closet
[21:45] as you have no wiggle room
[21:45] Which is already what we do, I think.
[21:45] The current sharding uses full mirroring, rather than erasure coding, yes (I think).
[21:46] Somebody2: one way to figure out the closet number might indeed be an every-6-months-or-so check
[21:46] But we do have multiple replicas of each shard
[21:46] So there's the wiggle room.
[21:46] I'm not sure how erasure coding would give us more.
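The "fairly simple proofs of possession" mentioned at [21:41] are not specified in the chat. Below is one hedged sketch of what such a challenge-response check could look like, assuming the verifier still holds the item (or has precomputed the expected answers). It illustrates the idea only; it is not the ia.bak protocol or any agreed design.

```python
# Minimal proof-of-possession sketch (assumption: the verifier has the item bytes,
# or has precomputed (nonce, digest) pairs, so it can check the prover's answer).
import hashlib
import os

def make_challenge():
    """Verifier picks a fresh random nonce so old answers can't be replayed."""
    return os.urandom(16)

def prove_possession(nonce, item_bytes):
    """Prover must hash the nonce together with the full item data it claims to hold."""
    return hashlib.sha256(nonce + item_bytes).hexdigest()

def verify(nonce, response, item_bytes):
    """Verifier recomputes the digest from its own copy and compares."""
    return response == hashlib.sha256(nonce + item_bytes).hexdigest()

if __name__ == "__main__":
    data = b"contents of some archived item"
    nonce = make_challenge()
    answer = prove_possession(nonce, data)   # computed by the volunteer
    print("possession verified:", verify(nonce, answer, data))
```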
[21:46] there's also the issue of inefficiency
[21:47] Somebody2: you're making the assumption that either a whole shard disappears or it doesn't
[21:47] that's where the inefficiency comes from
[21:47] in reality, only fragments of a shard may disappear
[21:48] so any system, centralized or not, has to make sure that there are enough fragments in each shard to keep the EC recoverable
[21:48] ez: No, if we have 4 full copies of shard3, say -- and each one loses 15%; as long as all four didn't lose the *same* data, we can still recover all of it.
[21:49] Somebody2: yes
[21:49] first, the chance of 15% overlap is quite high
[21:49] second, you lost 15% across the board and already have a high chance of failure
[21:50] and that's when using 4x more than you need to.
[21:50] with 1-of-3 you can lose 66% across the board, and still have full recoverability
[21:51] ez: I see.
[21:52] (i still like full mirrors for the simplicity of it, and they do in fact perform better than RS on small sets)
[21:52] but RS with aggressive settings like 85 out of 255 works a bit like magic compared to that
[21:52] And full mirroring also has the advantage of being transparent to the storage providers
[21:53] So people don't have to hold data they don't want to
[21:53] yea, with RS everyone would have to hold "garbage" they can't recover without the help of a bunch of random folks
[21:53] So that's why I still think the blockers to further progress on ia.bak are easier-to-install clients for more platforms, and promotion.
[21:56] Somebody2: it's kind of a moot point anyway, as RS, at big scales, can save perhaps 2x-3x storage compared to mirroring. it's an improvement, but not a vast enough improvement to warrant the complexity and the issues you mention
[21:56] Nods.
[21:57] ez: Are you interested/able to write improvements to our existing clients?
[21:57] honestly, i'm quite pessimistic about it
[21:57] no way in hell 100PB+ will appear out of thin air
[21:58] so i'm more daydreaming about shifting the paradigm way off, which could perhaps work better
[21:59] rather than incremental improvements to the current paradigm, which i'm fairly convinced can't be improved much more
[22:00] *** sep332 has joined #internetarchive.bak
[22:05] *** sep332 has quit IRC (Read error: Operation timed out)
[22:09] ez: You really think our current client programs can't be improved on?
[22:10] Or do you think they can't be improved on enough to provide 100PB+ out of thin air?
[22:10] oh they definitely can, in terms of ux and all, you're entirely right
[22:10] (which I agree with, but I don't think that's a reason not to improve them)
[22:10] So, interested?
[22:10] it's just that such an improvement could deliver, say, a magnitude or so
[22:10] and i'm the sort of black-and-white, all-or-nothing sort of guy
[22:11] if it's 0.5% or 5%, it's just awfully not enough. the avenue of asking for government grants seems far more viable tbh
[22:11] but that doesn't warrant much improvement on the client side
[22:15] in terms of lobbying, here's an idea: businesses often liquidate hardware not worth operating (meaning keeping it online). instead of asking for a grant to buy new hardware, get something rolling in the vein of "ecological disposal" of such hardware
[22:15] Nice idea.
[22:15] I sent the email to the Norwegian folks just now. Who knows how it will go, but it's done at least.
[22:15] i'm not sure if the logistics involved are worth it though.
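To put rough numbers on the mirroring-vs-erasure-coding exchange at [21:48]-[21:50], here is a back-of-the-envelope sketch in Python. The 4 copies, 15% loss, RS(255,85) and 66% figures come from the chat; the independent-loss assumption and the 1,000,000-item shard size are illustrative assumptions of mine.

```python
# Back-of-the-envelope comparison (assumptions: fragments/copies are lost
# independently and uniformly at random; shard size of 1,000,000 items is made up).
# Mirroring: an item is unrecoverable only if all 4 copies lost that item.
# RS(255,85): a codeword is recoverable as long as any 85 of its 255 fragments survive.
from math import comb

def mirror_item_survival(loss=0.15, copies=4):
    return 1 - loss ** copies

def rs_codeword_survival(loss, n=255, k=85):
    keep = 1 - loss
    return sum(comb(n, s) * keep**s * (1 - keep)**(n - s) for s in range(k, n + 1))

if __name__ == "__main__":
    p = mirror_item_survival()
    print(f"4 mirrors, 15% loss each: item survives with p = {p:.6f}")
    print(f"  -> in a 1,000,000-item shard, ~{(1 - p) * 1_000_000:.0f} items expected unrecoverable")
    for loss in (0.5, 0.6, 0.66):
        print(f"RS(255,85), {loss:.0%} fragment loss: codeword recoverable with p = "
              f"{rs_codeword_survival(loss):.4f}")
    # Note: at exactly 2/3 independent loss, RS(255,85) sits right at its threshold,
    # which is why the chat stresses the large statistical variance and aggressive settings.
```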
we're talking behemoth NAS arrays with iSCSI 250GB drives in them
[22:19] Somebody2: in any case, if a project specifically targeting hardware much more prone to faults were involved, i'd participate to make a client with RS support
[22:19] because mirroring becomes pretty inadequate with such an architecture
[22:20] What hardware would that be?
[22:21] basically old hardware you keep off and power on once a month; bring it all into one bunker, set up infra doing the power-ons and checks. the hardware and electricity costs are negligible, the majority of the cost would be physical labor and rent for the bunker.
[22:21] Please *DO* work on a client to support hardware like that!
[22:23] Somebody2: again, i can pinky pie on the software side, but this is still a huge endeavor meatspace-wise
[22:24] basically some operator of the "enterprise scrapyard"
[22:24] i'm not even sure such an idea is practical, the hardware is *extremely* inefficient. think a 1-ton rack full of scrap = 10tb
[22:25] (that's the worst case tho, in practice it's in the 100-500tb range)
[22:28] so basically a shitload of space, with not too much flammable material around, almost for free, would be adequate. i can't really think of such a place, basically some sort of warehouse in the middle of nowhere?
[22:53] *** tuluu has joined #internetarchive.bak
[22:56] *** tuluu has left
[23:16] ez: Eh, if we have the software, it will make working on getting the hardware more attractive.
[23:41] It somewhat distresses me how easily the whole thing could be duplicated if money were just thrown at the problem
[23:42] *** Senji_ is now known as Senji
[23:45] At work, with our current systems, we could turn 100PB into 6500 m^3 of tapes in 10 years (with a little additional investment we could probably bring that 10 years down to 1 year easily).
[23:46] I don't think we have space to store that many tapes, but ICBW
[23:47] But we'd charge $2.5m a year for that
[23:48] (that's two tape copies; I assume we'd charge about $1.5m for one tape copy)