#archiveteam-bs 2018-09-20,Thu

Time Nickname Message
00:11 🔗 bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
00:23 🔗 archodg_ has joined #archiveteam-bs
00:26 🔗 archodg has quit IRC (Ping timeout: 252 seconds)
01:38 🔗 albardin is now known as Albardin_
01:38 🔗 Albardin_ is now known as Albardin
01:42 🔗 Valentine has quit IRC (Ping timeout: 506 seconds)
01:45 🔗 Valentine has joined #archiveteam-bs
01:55 🔗 godane latest digitize tapes and scans : https://www.patreon.com/posts/digitize-tapes-21524135
02:08 🔗 Mateon1 has quit IRC (Ping timeout: 268 seconds)
02:08 🔗 Mateon1 has joined #archiveteam-bs
03:37 🔗 joepie91 astrid: IPFS deals better with things on a per-file level, and has mutable pointers (and was originally designed as application-embeddable though that seems to have kind of gotten lost); besides that, it is functionally equivalent
03:37 🔗 joepie91 per-file level as in, you don't need to group things into arbitrary 'collections' (torrents) that don't share file-level data between them
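Editor's sketch of the per-file point above: if every file is addressed by a hash of its own content, two collections that contain the same file automatically share it. This is only an illustration. `BlobStore`, `content_id` and `add_collection` are hypothetical names, and real IPFS splits files into chunks and uses its own identifier format rather than a bare SHA-256 hex digest.

```python
import hashlib
from typing import Dict


def content_id(data: bytes) -> str:
    """Address a file by the hash of its bytes (simplified stand-in for a real content ID)."""
    return hashlib.sha256(data).hexdigest()


class BlobStore:
    """Hypothetical in-memory store keyed by content hash."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        cid = content_id(data)
        # Identical content maps to the same key, so it is stored only once.
        self.blobs.setdefault(cid, data)
        return cid


def add_collection(store: BlobStore, files: Dict[str, bytes]) -> Dict[str, str]:
    """A 'collection' is just a mapping from names to content IDs."""
    return {name: store.add(data) for name, data in files.items()}


store = BlobStore()
first = add_collection(store, {"readme.txt": b"hello", "scan.png": b"fake image bytes"})
second = add_collection(store, {"copy-of-readme.txt": b"hello", "other.bin": b"data"})

# Both collections point at the same blob for the shared file.
print(first["readme.txt"] == second["copy-of-readme.txt"])
print(len(store.blobs))
```

With torrents, by contrast, each torrent has its own piece layout, so the same file in two torrents is not recognised as shared data.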
03:48 🔗 archodg__ has joined #archiveteam-bs
03:50 🔗 archodg_ has quit IRC (Ping timeout: 252 seconds)
04:07 🔗 underscor has quit IRC (Read error: Operation timed out)
04:07 🔗 JAA has quit IRC (Read error: Operation timed out)
04:07 🔗 underscor has joined #archiveteam-bs
04:07 🔗 swebb sets mode: +o underscor
04:08 🔗 jspiros has quit IRC (Read error: Operation timed out)
04:09 🔗 godane has quit IRC (west.us.hub irc.Prison.NET)
04:09 🔗 superkuh has quit IRC (west.us.hub irc.Prison.NET)
04:09 🔗 achip has quit IRC (west.us.hub irc.Prison.NET)
04:10 🔗 zyphlar has quit IRC (Ping timeout: 246 seconds)
04:10 🔗 c4rc4s has quit IRC (Read error: Operation timed out)
04:10 🔗 Petri152 has quit IRC (Ping timeout: 246 seconds)
04:11 🔗 superkuh_ has joined #archiveteam-bs
04:22 🔗 achip has joined #archiveteam-bs
04:40 🔗 godane has joined #archiveteam-bs
04:40 🔗 svchfoo1 sets mode: +o godane
05:09 🔗 c4rc4s has joined #archiveteam-bs
05:09 🔗 zyphlar has joined #archiveteam-bs
05:09 🔗 Petri152 has joined #archiveteam-bs
05:12 🔗 JAA has joined #archiveteam-bs
05:12 🔗 swebb sets mode: +o JAA
05:12 🔗 bakJAA sets mode: +o JAA
05:13 🔗 jspiros has joined #archiveteam-bs
05:19 🔗 archodg_ has joined #archiveteam-bs
05:21 🔗 archodg__ has quit IRC (Ping timeout: 252 seconds)
05:54 🔗 sam-p has quit IRC (Quit: Real life is a bare necessity of life)
06:12 🔗 astrid ah neat, thx
08:01 🔗 Raccoon Who here digs archive container file formats?
08:02 🔗 Raccoon 2018-07-20 -- Xz format (LZMA2, .7z) inadequate for long-term archiving -> http://lzip.nongnu.org/xz_inadequate.html
08:03 🔗 Raccoon We need to reinvent the wheel and come up with a perfect all-purpose container.
08:49 🔗 godane SketchCow: you may be getting some Japanese craft magazines soon
08:50 🔗 godane i found a website that has a ton of scans when i was looking for free japanese magazines
09:02 🔗 godane so good news i found the original source maybe for those japanese craft magazines
09:03 🔗 godane turns out there is a russian website (i think) that has tons of these magazines
09:03 🔗 godane i think a lot may be watermarked but it's better than nothing
09:04 🔗 godane example : http://giftjap.info/freebook/gallery_detailed.php?n=4409
09:07 🔗 eprillios has quit IRC (Read error: Operation timed out)
09:08 🔗 eprillios has joined #archiveteam-bs
09:43 🔗 coldice has joined #archiveteam-bs
10:00 🔗 logchfoo2 starts logging #archiveteam-bs at Thu Sep 20 10:00:22 2018
10:00 🔗 logchfoo2 has joined #archiveteam-bs
10:03 🔗 svchfoo3 has joined #archiveteam-bs
10:04 🔗 Raccoon has quit IRC (Killed (Silence (just be quiet now.)))
10:04 🔗 Raccoon has joined #archiveteam-bs
10:04 🔗 svchfoo1 sets mode: +o svchfoo3
10:35 🔗 eientei95 [20:03:29] <Raccoon> We need to reinvent the wheel and come up with a perfect all-purpose container.
10:35 🔗 eientei95 https://xkcd.com/927/
10:57 🔗 logchfoo4 starts logging #archiveteam-bs at Thu Sep 20 10:57:34 2018
10:57 🔗 logchfoo4 has joined #archiveteam-bs
11:05 🔗 Cameron_D has joined #archiveteam-bs
11:20 🔗 apache2 thefatpunk.dk is being discontinued, don't know if that's in scope for AT?
11:21 🔗 Raccoon eientei95: at least, there should be a comprehensive outline of container formats, their strengths and weaknesses, unique features and undesired flaws.
11:22 🔗 Raccoon I'm continually discovering that half the new file formats I encounter are just a zip or rar or gzip
11:23 🔗 coldice has joined #archiveteam-bs
11:23 🔗 eientei95 Raccoon: I'm a personal advocate of passworded split RAR files where the RAR file is put on a pen drive/SD card with the password written down in code on a piece of paper by a doctor and all bundled in an envelope
11:28 🔗 coldice has quit IRC (Ping timeout: 260 seconds)
11:33 🔗 Raccoon still faster and cheaper to mail an 8TB HDD across the world. is there a lot of that in here?
11:35 🔗 JAA Never underestimate the bandwidth of a container filled with microSD cards.
11:37 🔗 Raccoon unless it's a chinese shipping container. hideous packet loss. (counterfeit sd; containers overboard)
11:39 🔗 JAA apache2: eientei95 added it to ArchiveBot. Not sure when that job will start though. I hope it happens before the site goes down.
11:48 🔗 apache2 JAA: thanks! I'm making sure the actual files get mirrored, but for now if someone could make sure the metadata gets crawled that would be amazing
12:07 🔗 Albardin has quit IRC (Read error: Operation timed out)
12:08 🔗 zhongfu has quit IRC (Remote host closed the connection)
12:08 🔗 kiskabak has quit IRC (Ping timeout: 268 seconds)
12:10 🔗 zhongfu has joined #archiveteam-bs
12:16 🔗 plue has quit IRC (Quit: leaving)
12:30 🔗 bitBaron has joined #archiveteam-bs
13:10 🔗 BlueMax has quit IRC (Quit: Leaving)
14:25 🔗 BartoCH has quit IRC (Ping timeout: 615 seconds)
14:55 🔗 bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
15:02 🔗 bitBaron has joined #archiveteam-bs
16:01 🔗 schbirid has joined #archiveteam-bs
16:05 🔗 bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
16:26 🔗 bitBaron has joined #archiveteam-bs
16:31 🔗 wp494 has quit IRC (Ping timeout: 268 seconds)
16:31 🔗 wp494 has joined #archiveteam-bs
17:05 🔗 Dimtree has joined #archiveteam-bs
17:43 🔗 astrid what's wrong with .tar.gz
17:45 🔗 joepie91 doesn't have indexes :P
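Editor's sketch of the "no indexes" point, using only Python's standard `tarfile` and `zipfile` modules. A .tar.gz is a single compressed stream, so finding one member means decompressing and walking past everything before it. A zip has a central directory, so a reader can look a member up by name. The helper names and sample file names are made up for the illustration.

```python
import io
import tarfile
import zipfile

FILES = {"a.txt": b"alpha", "b.txt": b"bravo", "c.txt": b"charlie"}


def make_tar_gz(files):
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def make_zip(files):
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def read_from_tar_gz(blob, wanted):
    """Return (data, members_scanned). Stream mode 'r|gz' cannot seek,
    which makes the sequential scan explicit."""
    scanned = 0
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r|gz") as tar:
        for member in tar:
            scanned += 1
            if member.name == wanted:
                return tar.extractfile(member).read(), scanned
    raise KeyError(wanted)


def read_from_zip(blob, wanted):
    """Look the member up by name via the zip central directory."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return zf.read(wanted)


data, scanned = read_from_tar_gz(make_tar_gz(FILES), "c.txt")
print(data, "found after scanning", scanned, "members")
print(read_from_zip(make_zip(FILES), "c.txt"))
```

On a three-file archive the difference is invisible; on a multi-terabyte tarball it is the difference between a seek and reading the whole thing.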
17:47 🔗 ndiddy has joined #archiveteam-bs
17:56 🔗 JAA I didn't read that text about how .xz is inadequate now (though I think I've read it at some point in the past), but skimming over it, at least half of the points raised there are simply "it might fail when there's corruption". Well, prevent the corruption then: use a checksumming file system, make backups, etc. There is no file format in the world that will prevent corruption, and there never will be
17:57 🔗 JAA since it's an impossible thing to solve simply with the design of a file format.
17:58 🔗 astrid yeah but you can make corruption recoverable at later points in the file, and you can have robust checksums so you know when it is corrupted
18:01 🔗 JAA Sure, but I don't really see why that has to be part of the archive format itself. You can just as well handle that outside of it with something like PAR2.
18:02 🔗 astrid files tend to be separated from their metadata, from their containers
18:02 🔗 astrid it costs very little to make the file itself resilient
18:03 🔗 JAA True, but it makes the algorithm and the software needed much more complex as well.
18:03 🔗 JAA And at the same time, you usually get less flexibility than if you use a standalone parity calculation.
18:03 🔗 astrid all software needs error handling
18:04 🔗 astrid software needs to cope with corrupted data. it's a fact of life. files will get damaged.
18:04 🔗 astrid from a "make useful software" perspective, code that is able to say "oops, this is corrupted, it may not give good results, i can skip the damage if you want" is good code
18:05 🔗 astrid yes, package files in robust containers
18:05 🔗 astrid but also make files robust themselves
18:05 🔗 astrid (as much as necessary; not more)
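Editor's sketch of "this is corrupted, i can skip the damage if you want": if a format stores a checksum per block instead of one for the whole file, a reader can tell which blocks are bad and still return the rest. The framing below (2-byte length, 4-byte CRC32, payload) is a hypothetical toy format, not any real container, and it assumes the length fields themselves survive; damage to a header would break the framing.

```python
import struct
import zlib

HEADER = struct.Struct(">HI")  # payload length (uint16), CRC32 of payload (uint32)


def pack(data: bytes, block_size: int = 4) -> bytes:
    """Split data into blocks and prefix each with its length and CRC32."""
    out = bytearray()
    for start in range(0, len(data), block_size):
        chunk = data[start:start + block_size]
        out += HEADER.pack(len(chunk), zlib.crc32(chunk)) + chunk
    return bytes(out)


def unpack(blob: bytes, skip_damaged: bool = False):
    """Return (recovered_data, damaged_block_indexes).

    With skip_damaged=False the first bad block raises ValueError;
    with skip_damaged=True bad blocks are dropped and reported.
    """
    good, damaged = [], []
    pos = 0
    index = 0
    while pos < len(blob):
        length, crc = HEADER.unpack_from(blob, pos)
        pos += HEADER.size
        chunk = blob[pos:pos + length]
        pos += length
        if zlib.crc32(chunk) == crc:
            good.append(chunk)
        elif skip_damaged:
            damaged.append(index)
        else:
            raise ValueError(f"block {index} is corrupted")
        index += 1
    return b"".join(good), damaged


blob = bytearray(pack(b"abcdefghijkl"))
blob[16] ^= 0xFF  # flip bits in the payload of the second block
print(unpack(bytes(blob), skip_damaged=True))
```

The smaller the blocks, the less data a single bad byte costs, at the price of more checksum overhead.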
18:06 🔗 astrid akin's laws of spacecraft design
18:06 🔗 astrid number 1. Engineering is done with numbers. Analysis without numbers is only an opinion.
18:06 🔗 JAA Well yes, but that's just corruption detection, not corruption fixing. For the former, you can just prepend or append a checksum of the entire file. The latter requires much more complexity.
18:06 🔗 astrid i thought we were talking about corruption detection
18:07 🔗 JAA I guess we were talking about both at the same time.
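Editor's sketch of the detection-only approach JAA describes (append a checksum of the entire file). It uses SHA-256 from Python's standard `hashlib`; `seal` and `unseal` are hypothetical names. It can tell you that something changed, but not where, and it cannot repair anything.

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def seal(payload: bytes) -> bytes:
    """Append the SHA-256 digest of the payload to the payload."""
    return payload + hashlib.sha256(payload).digest()


def unseal(sealed: bytes) -> bytes:
    """Verify the trailing digest and return the payload, or raise ValueError."""
    if len(sealed) < DIGEST_SIZE:
        raise ValueError("too short to contain a checksum")
    payload, digest = sealed[:-DIGEST_SIZE], sealed[-DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("checksum mismatch: data is corrupted")
    return payload


sealed = seal(b"family photo bytes")
print(unseal(sealed))

tampered = bytearray(sealed)
tampered[0] ^= 0x01  # a single flipped bit is enough to fail verification
try:
    unseal(bytes(tampered))
except ValueError as err:
    print("detected:", err)
```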
18:09 🔗 astrid also number 13. Design is based on requirements. There's no justification for designing something one bit "better" than the requirements dictate.
18:18 🔗 JAA I agree that having corruption detection in an archive (or really any) file format makes sense in many cases. But I don't think corruption correction belongs in there. That's a very general problem, and so I think it needs to be solved in a general way as well, e.g. on the file-system level, with RAID, or with backups.
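Editor's sketch of correction handled outside the archive format, in the spirit of a standalone parity file. To keep it short this uses a single XOR parity block (the RAID 5 idea), which can rebuild exactly one lost block of known position. It is not how PAR2 works internally; PAR2 uses Reed-Solomon codes and can recover from more damage. `make_parity` and `recover` are hypothetical names.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(blocks: List[bytes]) -> bytes:
    """Compute one parity block over equally sized data blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return xor_blocks(blocks)


def recover(blocks: List[Optional[bytes]], parity: bytes) -> bytes:
    """Rebuild the single missing block (marked as None) from the survivors and the parity."""
    missing = [i for i, b in enumerate(blocks) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one missing block")
    survivors = [b for b in blocks if b is not None]
    return xor_blocks(survivors + [parity])


data = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(data)  # stored separately from the data itself

# Pretend block 1 was found to be corrupted (for example via a checksum) and discarded.
print(recover([b"AAAA", None, b"CCCC"], parity))
```

Note that the parity only helps once something else has identified which block is bad, which is why detection and correction tend to be discussed together.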
18:22 🔗 astrid yes
18:22 🔗 astrid we need to be able to depend on computers
18:23 🔗 astrid self-sealing file formats are good too. it's useful to have assurance that nothing has been modified about a file since it was created.
18:23 🔗 astrid like, i have a bunch of photos from my life. i don't want them to be modified at all. i've carted them around for a while and they have lived in many different systems
18:24 🔗 astrid some of which do not support this kind of assurance
18:24 🔗 astrid we live in the real world where computer systems are shitty and we don't have control over them
18:24 🔗 JAA Yeah
18:25 🔗 astrid now all my important things live in zfs, which checksums everything
18:25 🔗 astrid does that mean i am going to turn off all my other checksumming mechanisms?
18:25 🔗 astrid no
18:28 🔗 JAA Sure, there's no reason really not to do corruption detection at different levels. Detection is cheap and easy.
18:28 🔗 astrid i'm not sure what we disagree about exactly
18:28 🔗 astrid are you saying that extra parity information doesn't belong in files?
18:28 🔗 JAA Neither am I. :-)
18:29 🔗 JAA Ah, yep.
18:29 🔗 astrid ok
18:29 🔗 astrid i agree more or less with that stance
18:30 🔗 astrid with the caveat that:
18:30 🔗 astrid file formats have intended use cases
18:30 🔗 astrid if such a use case is environments hostile to data, the file needs to be resilient to that environment
18:30 🔗 astrid otherwise it is not fit for the purpose for which it has been ostensibly designed
18:33 🔗 chferfa has joined #archiveteam-bs
18:36 🔗 JAA Yeah, that's true.
18:56 🔗 bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
19:09 🔗 bitBaron has joined #archiveteam-bs
19:16 🔗 BartoCH has joined #archiveteam-bs
19:36 🔗 schbirid has quit IRC (Remote host closed the connection)
19:53 🔗 bitBaron has quit IRC (My computer has gone to sleep. 😴😪ZZZzzz…)
20:29 🔗 bitBaron has joined #archiveteam-bs
20:30 🔗 BartoCH has quit IRC (Quit: WeeChat 2.2)
20:35 🔗 BartoCH has joined #archiveteam-bs
20:45 🔗 closure_ has quit IRC (Read error: Operation timed out)
20:50 🔗 ndiddy has quit IRC (Ping timeout: 257 seconds)
20:58 🔗 closure has joined #archiveteam-bs
21:15 🔗 superkuh_ is now known as superkuh
21:33 🔗 BlueMax has joined #archiveteam-bs
21:40 🔗 closure has quit IRC (Read error: Connection reset by peer)
21:43 🔗 closure has joined #archiveteam-bs
21:59 🔗 closure has quit IRC (Read error: Connection reset by peer)
22:02 🔗 t2t2 has quit IRC (Read error: Operation timed out)
22:03 🔗 t2t2 has joined #archiveteam-bs
22:06 🔗 closure has joined #archiveteam-bs
22:20 🔗 BlueMax has quit IRC (Quit: Leaving)
22:26 🔗 ndiddy has joined #archiveteam-bs
22:31 🔗 Stiletto has quit IRC (Read error: Operation timed out)
22:33 🔗 closure has quit IRC (Read error: Operation timed out)
22:52 🔗 closure has joined #archiveteam-bs
22:56 🔗 bitBaron has quit IRC (Quit: My computer has gone to sleep. 😴😪ZZZzzz…)
22:59 🔗 closure has quit IRC (Read error: Connection reset by peer)
23:01 🔗 Raccoon has quit IRC (Quit: Just stay out of my wife!)
23:02 🔗 closure has joined #archiveteam-bs
23:18 🔗 Stilett0 has joined #archiveteam-bs
23:20 🔗 bitBaron has joined #archiveteam-bs
23:20 🔗 bitBaron has quit IRC (Client Quit)
23:21 🔗 bitBaron has joined #archiveteam-bs
23:33 🔗 closure has quit IRC (Read error: Connection reset by peer)
23:35 🔗 closure has joined #archiveteam-bs
23:52 🔗 antomati_ has quit IRC (Read error: Connection reset by peer)
23:52 🔗 antomatic has joined #archiveteam-bs
23:52 🔗 swebb sets mode: +o antomatic
23:53 🔗 ndiddy has quit IRC (Read error: Operation timed out)
