[00:38] *** godane has joined #archiveteam-ot
[00:40] *** adinbied has quit IRC (Read error: Operation timed out)
[00:41] *** adinbied has joined #archiveteam-ot
[01:25] Someone ran into the library that I'm sitting in
[01:25] (with their car)
[01:27] 0o
[01:27] *** BlueMax has joined #archiveteam-ot
[01:39] *** vectr0n has quit IRC (ZNC - https://znc.in)
[01:39] *** vectr0n has joined #archiveteam-ot
[02:40] *** kiskabak has quit IRC (Read error: Connection reset by peer)
[02:40] *** Albardin has quit IRC (Read error: Connection reset by peer)
[02:40] *** w0rmybak has quit IRC (Read error: Connection reset by peer)
[03:11] lol what?
[04:11] does anyone have any experience with working with ZIM files?
[04:14] how good is MWoffliner?
[05:25] *** adinbied has quit IRC (hub.efnet.us west.us.hub)
[05:25] *** Mateon1 has quit IRC (hub.efnet.us west.us.hub)
[05:25] *** mr_archiv has quit IRC (hub.efnet.us west.us.hub)
[05:25] *** sknebel has quit IRC (hub.efnet.us west.us.hub)
[05:25] *** robogoat has quit IRC (hub.efnet.us west.us.hub)
[05:25] *** svchfoo3 has quit IRC (hub.efnet.us west.us.hub)
[05:25] *** Jusque has quit IRC (hub.efnet.us west.us.hub)
[05:29] *** Stiletto has joined #archiveteam-ot
[05:30] *** Mateon1 has joined #archiveteam-ot
[05:30] *** mr_archiv has joined #archiveteam-ot
[05:30] *** sknebel has joined #archiveteam-ot
[05:30] *** robogoat has joined #archiveteam-ot
[05:30] *** svchfoo3 has joined #archiveteam-ot
[05:30] *** Jusque has joined #archiveteam-ot
[05:30] *** irc.mzima.net sets mode: +o svchfoo3
[05:33] *** Stilett0 has quit IRC (Read error: Operation timed out)
[05:41] *** Stilett0 has joined #archiveteam-ot
[05:46] *** Stiletto has quit IRC (Read error: Operation timed out)
[05:53] *** adinbied has joined #archiveteam-ot
[06:27] *** robogoat has quit IRC (Read error: Operation timed out)
[06:47] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[06:48] *** BlueMax has joined #archiveteam-ot
[07:00] *** robogoat has joined #archiveteam-ot
[07:05] *** BlueMax has quit IRC (Read error: Connection reset by peer)
[07:06] *** BlueMax has joined #archiveteam-ot
[08:05] *** robogoat has quit IRC (Read error: Operation timed out)
[08:13] *** robogoat has joined #archiveteam-ot
[08:23] *** robogoat has quit IRC (Read error: Operation timed out)
[09:00] *** robogoat has joined #archiveteam-ot
[09:35] *** BlueMax has quit IRC (Quit: Leaving)
[12:03] *** VerifiedJ has joined #archiveteam-ot
[12:55] JAA arkiver you guys around, I have question
[12:57] jrwr: Yeah
[13:01] I PM'd you JAA
[14:11] *** eggplanti has quit IRC (Read error: Connection reset by peer)
[15:45] JAA: should I use a single item every time or a new item for every backup I do
[15:50] jrwr: Whichever you prefer. Note that large items (over several hundred GB) tend to cause issues apparently. Maybe group by month/season/year, depending on the size and frequency?
[15:57] * jrwr pokes SketchCow
[15:57] I guess he would be the best to ask
[15:58] I have a website I want to export its dataset to the internet archive on a weekly basis, as my userbase has expressed they want this. so far its 10 files and about 50GB
[15:59] Whats the way you prefer that it was published to the IA as (even metadata/collection)
[16:00] in general IA doesn't like data that is continuously updated
[16:01] you could scrape your website as .warc files, and there is a process to put warcs in the wayback machine
[16:01] I mostly want somewhere to put the data for long term storage for the public so if someone wanted to clone the site or process the database information
[16:02] current plans was a weekly torrent
[16:02] whatever it is, weekly is too often, in my opinion
[16:02] I get about 2000 new items per wek right now
[16:03] I can do once a month
[16:05] How about this? You create weekly dumps on your server and make them available there, and less frequently, you push to IA.
[16:09] I do agree that weekly dumps are probably a bit much to upload to IA.
[16:09] Unless you want to do monthly or seasonal full dumps and weekly increments.
[17:05] *** schbirid has joined #archiveteam-ot
[17:12] *** wp494 has quit IRC (Ping timeout: 364 seconds)
[17:12] *** wp494 has joined #archiveteam-ot
[18:53] *** Mateon1 has quit IRC (Read error: Operation timed out)
[18:54] *** Mateon1 has joined #archiveteam-ot
[19:05] *** godane has quit IRC (Ping timeout: 260 seconds)
[19:06] jrwr: dat archive instead? p2p mirrors instead of IA, and it supports incremental updates (as new files, changed files get re-pushed whole)
[19:07] then you host a peer privately and your interested users can peer as well
[20:29] @Vito` its done :0 https://datbase.org/beatsaver/BeatSaver-Data-Set
[20:29] I like this protocol
[20:29] since its updates it self
[20:32] yeah so if your updates are just deltas or diffs as new files, I think peers won't have to redownload everything (I think if you replace json.tar and songs.tar they would)
[20:33] Ya
[20:34] also if your dat gets over 300GB the current implementations have issues last I heard
[20:34] but they're aware of them and are fixing them
[20:36] ya
[20:54] *** godane has joined #archiveteam-ot
[21:24] *** tuluu has quit IRC (Remote host closed the connection)
[21:25] *** tuluu has joined #archiveteam-ot
[21:39] *** BlueMax has joined #archiveteam-ot
[21:50] *** schbirid has quit IRC (Remote host closed the connection)
[22:48] *** adinbied has quit IRC (Read error: Operation timed out)
[22:51] *** adinbied has joined #archiveteam-ot
[23:43] What
[23:44] What are the items