[00:00] *** kyan has joined #internetarchive.bak
[00:23] registrar master 5e5f107 other SHARD4/pubkeys registration of Kaz on SHARD4
[01:03] registrar master 4ad9d88 other SHARD17/pubkeys registration of Kaz on SHARD17
[01:24] not all in the stats but I crossed 1T backed up today
[01:25] *** db48x has quit IRC (Read error: Operation timed out)
[01:47] *** Start has joined #internetarchive.bak
[03:59] registrar master 49ad503 other SHARD4/pubkeys registration of jaws2k12 on SHARD4
[04:08] *** sevs has joined #internetarchive.bak
[04:25] *** kyan has quit IRC (Quit: Leaving)
[05:46] registrar master 2c4262f other SHARD14/pubkeys registration of jaws2k12 on SHARD14
[05:50] SketchCow: not sure yet. i'd like some time to work out the shard deployment procedure
[05:51] OK.
[05:51] I'll write something but won't shoot it until everyone's comfortable.
[05:52] in the meantime, ArchiveBot WARCs alone are something like ~30 shards worth
[06:00] Yep, it's a party
[06:00] They're also the most unique public-accessible parts of the archive
[06:00] eh I dunno the software library is up there
[06:01] Software library has plastic backups
[06:01] archivebot has nothing like that
[06:01] Hey, I want both
[06:01] Just saying
[06:01] ah yes
[06:53] registrar master 835ef6c other SHARD15/pubkeys registration of jaws2k12 on SHARD15
[07:39] yipdw: 37 I think
[07:42] HCross2: it should be 32 if you're using my splits
[07:42] Ah yes
[07:42] It was 37 on mine
[08:17] *** db48x has joined #internetarchive.bak
[08:32] registrar master 3031a28 other SHARD10/pubkeys registration of deewiant+ia.bak on SHARD10
[09:58] * db48x yawns
[10:02] hey, we have connection data again
[10:02] maybe I should have bounced the carbon service
[10:16] I'm afraid that iabak has gotten a lot slower than it used to be
[10:17] too many people downloading?
[10:18] db48x: the VM, or something larger-scoped
[10:19] io is slower, so I guess it's the vm
[10:19] there is definitely quite a bit of load from git pack-objects on these repos
[10:20] yea, but it's all iowait
[10:20] hm
[10:21] actually, huh
[10:21] how big is SHARD8?
[10:21] git pack-objects is eating a tremendous proportion of RAM
[10:21] 6.72 TB
[10:21] Hmm, this bit of shard4 of mine is refusing to register
[10:21] the repo is 6.72 TB, or is that repo + annexed objects
[10:22] hmm
[10:23] I wonder if --window-memory would help here
[10:24] oh, we're okay again
[10:25] rss went down on that pack-objects
[10:25] oh, because it's doing SHARD9 now :)
[10:25] hmm yeah
[10:25] lots of OOM killer activity too
[10:26] i think we need to tune these pack-objects invocations
[10:26] yea, quite probable
[10:26] what's the rationale behind --window=250 and --depth=50?
[10:27] well --window I guess
[10:27] --depth seems default
[10:27] unknown
[10:30] oh
[10:30] that may be what git gc --aggressive does
[10:30] from shardmaint
[10:31] could we uh just change that to git gc --auto
[10:31] shards don't change often, so repacking shouldn't be needed every time
[10:35] It took 468.28 seconds to enumerate untracked files. 'status -uno'
[10:35] sheesh
[10:37] I'll give that a try and see if it smooths out our performance over time
[10:40] next shardmaint run, anyway -- I don't want to interrupt this one
[11:22] *** Kksmkrn has quit IRC (Ping timeout: 250 seconds)
[12:08] alas
[12:08] shardmaint is done, but still "It took 148.63 seconds to enumerate untracked files."
[12:08] it did this in a couple of seconds before
[12:36] cloning shards is really slow as well
[12:43] *** db48x has quit IRC (Remote host closed the connection)
[12:49] *** Kksmkrn has joined #internetarchive.bak
[12:53] *** Kksmkrn has quit IRC (Quit: Now where did my session go?)
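Note on the 10:23 to 10:35 exchange: the change being floated (running `git gc --auto` instead of `git gc --aggressive`, possibly with a cap on delta-window memory) could look roughly like the sketch below. This is not the actual shardmaint script; the helper names, the `512m` cap, and the shard paths are assumptions made for illustration. It relies only on the `pack.windowMemory` config key, `git gc --auto`, and `git status -uno`, all of which exist in git.

```python
import subprocess
from typing import Iterable, List, Optional


def gc_command(repo: str,
               aggressive: bool = False,
               window_memory: Optional[str] = "512m") -> List[str]:
    """Build a git gc invocation for one shard repo (hypothetical helper).

    --auto only repacks when git's own thresholds are exceeded, so a shard
    that has not changed is left alone. pack.windowMemory limits how much
    memory the delta window may use; 512m is an assumed value, not a
    figure from the log.
    """
    cmd = ["git", "-C", repo]
    if window_memory:
        cmd += ["-c", f"pack.windowMemory={window_memory}"]
    cmd += ["gc", "--aggressive" if aggressive else "--auto"]
    return cmd


def status_command(repo: str) -> List[str]:
    """git status without the untracked-file scan ('status -uno')."""
    return ["git", "-C", repo, "status", "-uno"]


def run_maintenance(repos: Iterable[str]) -> None:
    """Run the gentler gc over each shard repo in turn."""
    for repo in repos:
        subprocess.run(gc_command(repo), check=True)
```

Calling `run_maintenance(["SHARD8", "SHARD9"])` would run the gentler gc on two hypothetical checkout paths. Whether this actually relieves the RAM and iowait pressure described above is untested here.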
[12:56] *** Kksmkrn has joined #internetarchive.bak
[13:09] *** VADemon has joined #internetarchive.bak
[13:11] yipdw db48x: VM is now on normal storage drives, previous SSD, but space was an issue, they aren't big SSDs
[13:49] *** VADemon has quit IRC (Quit: left4dead)
[13:53] *** db48x has joined #internetarchive.bak
[14:22] *** kyan has joined #internetarchive.bak
[15:28] *** Start has quit IRC (Quit: Disconnected.)
[15:36] registrar master c281ea8 other SHARD10/pubkeys registration of milenko on SHARD10
[15:59] did the server migrate
[15:59] non-ssd disk..
[16:49] *** RKenshin has joined #internetarchive.bak
[16:49] *** Kenshin has quit IRC (Read error: Operation timed out)
[16:49] *** RKenshin is now known as Kenshin
[16:50] *** svchfoo1 sets mode: +o Kenshin
[17:02] *** atomotic has joined #internetarchive.bak
[17:13] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[18:14] I've got an issue. I checksum at the end of each file download, but on ARM cores this takes an age. Can I move the checksum to the end of the downloading?
[18:15] I've basically got a smartphone with a 6tb HDD attached
[19:33] *** atomotic has joined #internetarchive.bak
[19:58] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[20:07] HCross2: have you tried parallelizing the downloads? i wonder if that would allow for download to continue while checksum is happening
[23:05] *** kyan has quit IRC (Remote host closed the connection)
[23:14] registrar master a6caaf2 other SHARD4/pubkeys registration of mail on SHARD4
[23:14] registrar master da31d34 other SHARD15/pubkeys registration of mail on SHARD15
[23:36] *** sevs has quit IRC (Ping timeout: 268 seconds)
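Note on the 18:14 and 20:07 messages: the idea of letting downloads continue while checksums run can be sketched generically as a two-stage pipeline, with one thread fetching and another hashing finished files. This is not how the iabak client itself works. The `fetch` callback, the function names, and the choice of SHA-256 are assumptions for illustration. In CPython, `hashlib` releases the GIL while hashing large buffers, so the hashing thread can overlap with I/O in the fetching thread.

```python
import hashlib
import queue
import threading
from typing import Callable, Dict, Iterable


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so memory use stays flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def download_and_verify(names: Iterable[str],
                        fetch: Callable[[str], str]) -> Dict[str, str]:
    """Fetch items one by one while a worker thread hashes finished ones.

    fetch(name) is a hypothetical callback that downloads one item and
    returns its local path. Returns a mapping of name to hex digest.
    """
    finished: "queue.Queue" = queue.Queue()
    digests: Dict[str, str] = {}

    def hasher() -> None:
        while True:
            item = finished.get()
            if item is None:  # sentinel: no more files are coming
                break
            name, path = item
            digests[name] = sha256_file(path)

    worker = threading.Thread(target=hasher)
    worker.start()
    try:
        for name in names:
            # The next fetch starts while the previous file is hashed.
            finished.put((name, fetch(name)))
    finally:
        finished.put(None)
        worker.join()
    return digests
```

On a slow ARM core the hashing thread may still fall behind the downloads; the queue then simply grows, and the final `join()` waits for the backlog to clear.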