[00:11] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[00:23] *** archodg_ has joined #archiveteam-bs
[00:26] *** archodg has quit IRC (Ping timeout: 252 seconds)
[01:38] *** albardin is now known as Albardin_
[01:38] *** Albardin_ is now known as Albardin
[01:42] *** Valentine has quit IRC (Ping timeout: 506 seconds)
[01:45] *** Valentine has joined #archiveteam-bs
[01:55] latest digitize tapes and scans : https://www.patreon.com/posts/digitize-tapes-21524135
[02:08] *** Mateon1 has quit IRC (Ping timeout: 268 seconds)
[02:08] *** Mateon1 has joined #archiveteam-bs
[03:37] astrid: IPFS deals better with things on a per-file level, and has mutable pointers (and was originally designed as application-embeddable though that seems to have kind of gotten lost); besides that, it is functionally equivalent
[03:37] per-file level as in, you don't need to group things into arbitrary 'collections' (torrents) that don't share file-level data between them
[03:48] *** archodg__ has joined #archiveteam-bs
[03:50] *** archodg_ has quit IRC (Ping timeout: 252 seconds)
[04:07] *** underscor has quit IRC (Read error: Operation timed out)
[04:07] *** JAA has quit IRC (Read error: Operation timed out)
[04:07] *** underscor has joined #archiveteam-bs
[04:07] *** swebb sets mode: +o underscor
[04:08] *** jspiros has quit IRC (Read error: Operation timed out)
[04:09] *** godane has quit IRC (west.us.hub irc.Prison.NET)
[04:09] *** superkuh has quit IRC (west.us.hub irc.Prison.NET)
[04:09] *** achip has quit IRC (west.us.hub irc.Prison.NET)
[04:10] *** zyphlar has quit IRC (Ping timeout: 246 seconds)
[04:10] *** c4rc4s has quit IRC (Read error: Operation timed out)
[04:10] *** Petri152 has quit IRC (Ping timeout: 246 seconds)
[04:11] *** superkuh_ has joined #archiveteam-bs
[04:22] *** achip has joined #archiveteam-bs
[04:40] *** godane has joined #archiveteam-bs
[04:40] *** svchfoo1 sets mode: +o godane
[05:09] *** c4rc4s has joined #archiveteam-bs
[05:09] *** zyphlar has joined #archiveteam-bs
[05:09] *** Petri152 has joined #archiveteam-bs
[05:12] *** JAA has joined #archiveteam-bs
[05:12] *** swebb sets mode: +o JAA
[05:12] *** bakJAA sets mode: +o JAA
[05:13] *** jspiros has joined #archiveteam-bs
[05:19] *** archodg_ has joined #archiveteam-bs
[05:21] *** archodg__ has quit IRC (Ping timeout: 252 seconds)
[05:54] *** sam-p has quit IRC (Quit: Real life is a bare neccessity of life)
[06:12] ah neat, thx
[08:01] Who here digs archive container file formats?
[08:02] 2018-07-20 -- Xz format (LZMA2, .7z) inadequate for long-term archiving -> http://lzip.nongnu.org/xz_inadequate.html
[08:03] We need to reinvent the wheel and come up with a perfect all-purpose container.
[08:49] SketchCow: you may be getting some Japanese craft magazines soon
[08:50] i found a website that has a ton of scans when i was looking for free japanese magazines
[09:02] so good news i found the original source maybe for those japanese craft magazines
[09:03] turns out there is a russian website (i think) that has tons of these magazines
[09:03] i think a lot may be watermarked but it's better than nothing
[09:04] example : http://giftjap.info/freebook/gallery_detailed.php?n=4409
[09:07] *** eprillios has quit IRC (Read error: Operation timed out)
[09:08] *** eprillios has joined #archiveteam-bs
[09:43] *** coldice has joined #archiveteam-bs
[10:00] *** logchfoo2 starts logging #archiveteam-bs at Thu Sep 20 10:00:22 2018
[10:00] *** logchfoo2 has joined #archiveteam-bs
[10:03] *** svchfoo3 has joined #archiveteam-bs
[10:04] *** Raccoon has quit IRC (Killed (Silence (just be quiet now.)))
[10:04] *** Raccoon has joined #archiveteam-bs
[10:04] *** svchfoo1 sets mode: +o svchfoo3
[10:35] [20:03:29] We need to reinvent the wheel and come up with a perfect all-purpose container.
[10:35] https://xkcd.com/927/
[10:57] *** logchfoo4 starts logging #archiveteam-bs at Thu Sep 20 10:57:34 2018
[10:57] *** logchfoo4 has joined #archiveteam-bs
[11:05] *** Cameron_D has joined #archiveteam-bs
[11:20] thefatpunk.dk is being discontinued, don't know if that's in scope for AT?
[11:21] eientei95: at least, there should be a comprehensive outline of container formats, their strengths and weaknesses, unique features and undesired flaws.
[11:22] I'm continually discovering that half the new file formats I encounter are just a zip or rar or gzip
[11:23] *** coldice has joined #archiveteam-bs
[11:23] Raccoon: I'm a personal advocate of passworded split RAR files where the RAR file is put on a pen drive/SD card with the password written down in code on a piece of paper by a doctor and all bundled in an envelope
[11:28] *** coldice has quit IRC (Ping timeout: 260 seconds)
[11:33] still faster and cheaper to mail an 8TB HDD across the world. is there a lot of that in here?
[11:35] Never underestimate the bandwidth of a container filled with microSD cards.
[11:37] unless it's a chinese shipping container. hideous packet loss. (counterfeit sd; containers overboard)
[11:39] apache2: eientei95 added it to ArchiveBot. Not sure when that job will start though. I hope it happens before the site goes down.
[11:48] JAA: thanks! I'm making sure the actual files get mirrored, but for now if someone could make sure the metadata gets crawled that would be amazing
[12:07] *** Albardin has quit IRC (Read error: Operation timed out)
[12:08] *** zhongfu has quit IRC (Remote host closed the connection)
[12:08] *** kiskabak has quit IRC (Ping timeout: 268 seconds)
[12:10] *** zhongfu has joined #archiveteam-bs
[12:16] *** plue has quit IRC (Quit: leaving)
[12:30] *** bitBaron has joined #archiveteam-bs
[13:10] *** BlueMax has quit IRC (Quit: Leaving)
[14:25] *** BartoCH has quit IRC (Ping timeout: 615 seconds)
[14:55] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[15:02] *** bitBaron has joined #archiveteam-bs
[16:01] *** schbirid has joined #archiveteam-bs
[16:05] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[16:26] *** bitBaron has joined #archiveteam-bs
[16:31] *** wp494 has quit IRC (Ping timeout: 268 seconds)
[16:31] *** wp494 has joined #archiveteam-bs
[17:05] *** Dimtree has joined #archiveteam-bs
[17:43] what's wrong with .tar.gz
[17:45] doesn't have indexes :P
[17:47] *** ndiddy has joined #archiveteam-bs
[17:56] I didn't read that text about how .xz is inadequate now (though I think I've read it at some point in the past), but skimming over it, at least half of the points raised there are simply "it might fail when there's corruption". Well, prevent the corruption then: use a checksumming file system, make backups, etc. There is no file format in the world that will prevent corruption, and there never will be
[17:57] since it's an impossible thing to solve simply with the design of a file format.
[17:58] yeah but you can make corruption recoverable at later points in the file, and you can have robust checksums so you know when it is corrupted
[18:01] Sure, but I don't really see why that has to be part of the archive format itself. You can just as well handle that outside of it with something like PAR2.
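A minimal sketch of the "handle it outside the archive format" idea from [18:01], assuming a single XOR parity block kept in a separate sidecar. This is not how PAR2 works internally (PAR2 uses Reed-Solomon codes and can survive several lost blocks); the names `make_sidecar`, `repair`, and `BLOCK_SIZE` are hypothetical and chosen only for illustration. It also assumes damage changes bytes in place without changing the file length.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical; tiny so the demo is easy to follow


def split_blocks(data: bytes) -> list[bytes]:
    """Split data into BLOCK_SIZE chunks, zero-padding the last one."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    if blocks and len(blocks[-1]) < BLOCK_SIZE:
        blocks[-1] = blocks[-1].ljust(BLOCK_SIZE, b"\x00")
    return blocks


def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR all blocks together byte by byte."""
    parity = bytearray(BLOCK_SIZE)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def make_sidecar(data: bytes) -> dict:
    """Build recovery data that lives next to the file, not inside it."""
    blocks = split_blocks(data)
    return {
        "length": len(data),
        "digests": [hashlib.sha256(b).hexdigest() for b in blocks],
        "parity": xor_parity(blocks),
    }


def repair(data: bytes, sidecar: dict) -> bytes:
    """Return the original bytes if at most one block is damaged."""
    if len(data) != sidecar["length"]:
        raise ValueError("length changed; this sketch only handles in-place damage")
    blocks = split_blocks(data)
    bad = [
        i for i, b in enumerate(blocks)
        if hashlib.sha256(b).hexdigest() != sidecar["digests"][i]
    ]
    if not bad:
        return data
    if len(bad) > 1:
        raise ValueError("more than one damaged block; XOR parity cannot repair this")
    # XOR of every good block plus the parity block yields the missing block.
    good = [b for i, b in enumerate(blocks) if i != bad[0]]
    blocks[bad[0]] = xor_parity(good + [sidecar["parity"]])
    return b"".join(blocks)[:sidecar["length"]]
```

The per-block digests do the detection and the parity block does the repair, which mirrors the distinction drawn later in the discussion: detection is cheap, while correction needs extra data and more logic.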
[18:02] files tend to be separated from their metadata, from their containers
[18:02] it costs very little to make the file itself resilient
[18:03] True, but it makes the algorithm and the software needed much more complex as well.
[18:03] And at the same time, you usually get less flexibility than if you use a standalone parity calculation.
[18:03] all software needs error handling
[18:04] software needs to cope with corrupted data. it's a fact of life. files will get damaged.
[18:04] from a "make useful software" perspective, code that is able to say "oops, this is corrupted, it may not give good results, i can skip the damage if you want" is good code
[18:05] yes, package files in robust containers
[18:05] but also make files robust themselves
[18:05] (as much as necessary; not more)
[18:06] akin's laws of spacecraft design
[18:06] number 1. Engineering is done with numbers. Analysis without numbers is only an opinion.
[18:06] Well yes, but that's just corruption detection, not corruption fixing. For the former, you can just prepend or append a checksum of the entire file. The latter requires much more complexity.
[18:06] i thought we were talking about corruption detection
[18:07] I guess we were talking about both at the same time.
[18:09] also number 13. Design is based on requirements. There's no justification for designing something one bit "better" than the requirements dictate.
[18:18] I agree that having corruption detection in an archive (or really any) file format makes sense in many cases. But I don't think corruption correction belongs in there. That's a very general problem, and so I think it needs to be solved in a general way as well, e.g. on the file-system level, with RAID, or with backups.
[18:22] yes
[18:22] we need to be able to depend on computers
[18:23] self-sealing file formats are good too. it's useful to have assurance that nothing has been modified about a file since it was created.
[18:23] like, i have a bunch of photos from my life. i don't want them to be modified at all. i've carted them around for a while and they have lived in many different systems
[18:24] some of which do not support this kind of assurance
[18:24] we live in the real world where computer systems are shitty and we don't have control over them
[18:24] Yeah
[18:25] now all my important things live in zfs, which checksums everything
[18:25] does that mean i am going to turn off all my other checksumming mechanisms?
[18:25] no
[18:28] Sure, there's no reason really not to do corruption detection at different levels. Detection is cheap and easy.
[18:28] i'm not sure what we disagree about exactly
[18:28] are you saying that extra parity information doesn't belong in files?
[18:28] Neither am I. :-)
[18:29] Ah, yep.
[18:29] ok
[18:29] i agree more or less with that stance
[18:30] with the caveat that:
[18:30] file formats have intended use cases
[18:30] if such a use case is environments hostile to data, the file needs to be resilient to that environment
[18:30] otherwise it is not fit for the purpose for which it has been ostensibly designed
[18:33] *** chferfa has joined #archiveteam-bs
[18:36] Yeah, that's true.
[18:56] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[19:09] *** bitBaron has joined #archiveteam-bs
[19:16] *** BartoCH has joined #archiveteam-bs
[19:36] *** schbirid has quit IRC (Remote host closed the connection)
[19:53] *** bitBaron has quit IRC (My computer has gone to sleep. 😴😪ZZZzzz…)
[20:29] *** bitBaron has joined #archiveteam-bs
[20:30] *** BartoCH has quit IRC (Quit: WeeChat 2.2)
[20:35] *** BartoCH has joined #archiveteam-bs
[20:45] *** closure_ has quit IRC (Read error: Operation timed out)
[20:50] *** ndiddy has quit IRC (Ping timeout: 257 seconds)
[20:58] *** closure has joined #archiveteam-bs
[21:15] *** superkuh_ is now known as superkuh
[21:33] *** BlueMax has joined #archiveteam-bs
[21:40] *** closure has quit IRC (Read error: Connection reset by peer)
[21:43] *** closure has joined #archiveteam-bs
[21:59] *** closure has quit IRC (Read error: Connection reset by peer)
[22:02] *** t2t2 has quit IRC (Read error: Operation timed out)
[22:03] *** t2t2 has joined #archiveteam-bs
[22:06] *** closure has joined #archiveteam-bs
[22:20] *** BlueMax has quit IRC (Quit: Leaving)
[22:26] *** ndiddy has joined #archiveteam-bs
[22:31] *** Stiletto has quit IRC (Read error: Operation timed out)
[22:33] *** closure has quit IRC (Read error: Operation timed out)
[22:52] *** closure has joined #archiveteam-bs
[22:56] *** bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
[22:59] *** closure has quit IRC (Read error: Connection reset by peer)
[23:01] *** Raccoon has quit IRC (Quit: Just stay out of my wife!)
[23:02] *** closure has joined #archiveteam-bs
[23:18] *** Stilett0 has joined #archiveteam-bs
[23:20] *** bitBaron has joined #archiveteam-bs
[23:20] *** bitBaron has quit IRC (Client Quit)
[23:21] *** bitBaron has joined #archiveteam-bs
[23:33] *** closure has quit IRC (Read error: Connection reset by peer)
[23:35] *** closure has joined #archiveteam-bs
[23:52] *** antomati_ has quit IRC (Read error: Connection reset by peer)
[23:52] *** antomatic has joined #archiveteam-bs
[23:52] *** swebb sets mode: +o antomatic
[23:53] *** ndiddy has quit IRC (Read error: Operation timed out)
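
The [18:06] remark that detection only needs a checksum of the entire file prepended or appended can be sketched roughly as follows. This assumes a SHA-256 digest appended as a fixed-size trailer; the function names `seal` and `unseal` are hypothetical and do not correspond to any existing archive format.

```python
import hashlib

# Size in bytes of the trailer appended by seal(); 32 for SHA-256.
DIGEST_SIZE = hashlib.sha256().digest_size


def seal(payload: bytes) -> bytes:
    """Append a SHA-256 digest of the payload as a trailer."""
    return payload + hashlib.sha256(payload).digest()


def unseal(sealed: bytes) -> bytes:
    """Check the trailer and return the payload, or raise on corruption."""
    if len(sealed) < DIGEST_SIZE:
        raise ValueError("too short to contain a checksum trailer")
    payload, trailer = sealed[:-DIGEST_SIZE], sealed[-DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != trailer:
        raise ValueError("checksum mismatch: data is corrupted")
    return payload
```

This only detects accidental damage; it cannot repair anything, and since anyone who edits the payload can recompute the trailer, it gives no assurance against deliberate modification. That stronger "self-sealing" property would need a signature or keyed MAC.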