#internetarchive.bak 2016-11-24,Thu


Time Nickname Message
00:00 🔗 kyan has joined #internetarchive.bak
00:23 🔗 iabak-reg registrar master 5e5f107 other SHARD4/pubkeys registration of Kaz on SHARD4
01:03 🔗 iabak-reg registrar master 4ad9d88 other SHARD17/pubkeys registration of Kaz on SHARD17
01:24 🔗 thelsdj not all in the stats but I crossed 1T backed up today
01:25 🔗 db48x has quit IRC (Read error: Operation timed out)
01:47 🔗 Start has joined #internetarchive.bak
03:59 🔗 iabak-reg registrar master 49ad503 other SHARD4/pubkeys registration of jaws2k12 on SHARD4
04:08 🔗 sevs has joined #internetarchive.bak
04:25 🔗 kyan has quit IRC (Quit: Leaving)
05:46 🔗 iabak-reg registrar master 2c4262f other SHARD14/pubkeys registration of jaws2k12 on SHARD14
05:50 🔗 yipdw SketchCow: not sure yet. i'd like some time to work out the shard deployment procedure
05:51 🔗 SketchCow OK.
05:51 🔗 SketchCow I'll write something but won't shoot it until everyone's comfortable.
05:52 🔗 yipdw in the meantime, ArchiveBot WARCs alone are something like ~30 shards worth
06:00 🔗 SketchCow Yep, it's a party
06:00 🔗 SketchCow They're also the most unique public-accessible parts of the archive
06:00 🔗 yipdw eh I dunno the software library is up there
06:01 🔗 SketchCow Software library has plastic backups
06:01 🔗 SketchCow archivebot has nothing like that
06:01 🔗 SketchCow Hey, I want both
06:01 🔗 SketchCow Just saying
06:01 🔗 yipdw ah yes
06:53 🔗 iabak-reg registrar master 835ef6c other SHARD15/pubkeys registration of jaws2k12 on SHARD15
07:39 🔗 HCross2 yipdw: 37 I think
07:42 🔗 yipdw HCross2: it should be 32 if you're using my splits
07:42 🔗 HCross2 Ah yes
07:42 🔗 HCross2 It was 37 on mine
08:17 🔗 db48x has joined #internetarchive.bak
08:32 🔗 iabak-reg registrar master 3031a28 other SHARD10/pubkeys registration of deewiant+ia.bak on SHARD10
09:58 🔗 * db48x yawns
10:02 🔗 db48x hey, we have connection data again
10:02 🔗 db48x maybe I should have bounced the carbon service
10:16 🔗 db48x I'm afraid that iabak has gotten a lot slower than it used to be
10:17 🔗 Senji too many people downloading?
10:18 🔗 yipdw db48x: the VM, or something larger-scoped
10:19 🔗 db48x io is slower, so I guess it's the vm
10:19 🔗 yipdw there is definitely quite a bit of load from git pack-objects on these repos
10:20 🔗 db48x yea, but it's all iowait
10:20 🔗 yipdw hm
10:21 🔗 yipdw actually, huh
10:21 🔗 yipdw how big is SHARD8?
10:21 🔗 yipdw git pack-objects is eating a tremendous proportion of RAM
10:21 🔗 db48x 6.72 TB
10:21 🔗 Senji Hmm, this bit of shard4 of mine is refusing to register
10:21 🔗 yipdw the repo is 6.72 TB, or is that repo + annexed objects
10:22 🔗 yipdw hmm
10:23 🔗 yipdw I wonder if --window-memory would help here
10:24 🔗 yipdw oh, we're okay again
10:25 🔗 db48x rss went down on that pack-objects
10:25 🔗 db48x oh, because it's doing SHARD9 now :)
10:25 🔗 yipdw hmm yeah
10:25 🔗 yipdw lots of OOM killer activity too
10:26 🔗 yipdw i think we need to tune these pack-objects invocations
10:26 🔗 db48x yea, quite probable
10:26 🔗 yipdw what's the rationale behind --window=250 and --depth=50?
10:27 🔗 yipdw well --window I guess
10:27 🔗 yipdw --depth seems default
10:27 🔗 db48x unknown
10:30 🔗 yipdw oh
10:30 🔗 yipdw that may be what git gc --aggressive does
10:30 🔗 yipdw from shardmaint
10:31 🔗 yipdw could we uh just change that to git gc --auto
10:31 🔗 yipdw shards don't change often, so repacking shouldn't be needed every time
10:35 🔗 db48x It took 468.28 seconds to enumerate untracked files. 'status -uno'
10:35 🔗 db48x sheesh
10:37 🔗 yipdw I'll give that a try and see if it smoothes out our performance over time
10:40 🔗 yipdw next shardmaint run, anyway -- I don't want to interrupt this one
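
[Editor's sketch] The change discussed above (running `git gc --auto` instead of `git gc --aggressive`, optionally capping delta-window memory) could look roughly like the following. This is only an illustration: the real shardmaint script is not shown in this log, and the function names `gc_command` and `run_gc` and the example path are hypothetical. The git options used (`-C`, `-c`, `gc --auto`, `gc --aggressive`, and the `pack.windowMemory` setting) are standard git.

```python
import subprocess


def gc_command(repo_path, aggressive=False, window_memory=None):
    """Build the argv for a git gc run on one shard repository.

    repo_path and window_memory (e.g. "256m") are illustrative inputs,
    not values taken from the actual shardmaint script.
    """
    cmd = ["git", "-C", str(repo_path)]
    if window_memory:
        # pack.windowMemory limits the memory pack-objects uses for its delta window.
        cmd += ["-c", f"pack.windowMemory={window_memory}"]
    cmd.append("gc")
    # --auto only repacks when git's own thresholds say it is worthwhile;
    # --aggressive always repacks with a larger delta window.
    cmd.append("--aggressive" if aggressive else "--auto")
    return cmd


def run_gc(repo_path, **kwargs):
    """Run the command; raises CalledProcessError if git exits non-zero."""
    return subprocess.run(gc_command(repo_path, **kwargs), check=True)
```

For example, `gc_command("/srv/shards/SHARD8", window_memory="256m")` builds a `--auto` run with a memory cap; since shards rarely change, most such runs would be expected to do little or no repacking.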
11:22 🔗 Kksmkrn has quit IRC (Ping timeout: 250 seconds)
12:08 🔗 db48x alas
12:08 🔗 db48x shardmaint is done, but still "It took 148.63 seconds to enumerate untracked files."
12:08 🔗 db48x it did this in a couple of seconds before
12:36 🔗 db48x cloning shards is really slow as well
12:43 🔗 db48x has quit IRC (Remote host closed the connection)
12:49 🔗 Kksmkrn has joined #internetarchive.bak
12:53 🔗 Kksmkrn has quit IRC (Quit: Now where did my session go?)
12:56 🔗 Kksmkrn has joined #internetarchive.bak
13:09 🔗 VADemon has joined #internetarchive.bak
13:11 🔗 Kenshin yipdw db48x: VM is now on normal storage drives; it was previously on SSD, but space was an issue, they aren't big SSDs
13:49 🔗 VADemon has quit IRC (Quit: left4dead)
13:53 🔗 db48x has joined #internetarchive.bak
14:22 🔗 kyan has joined #internetarchive.bak
15:28 🔗 Start has quit IRC (Quit: Disconnected.)
15:36 🔗 iabak-reg registrar master c281ea8 other SHARD10/pubkeys registration of milenko on SHARD10
15:59 🔗 closure did the server migrate
15:59 🔗 closure non-ssd disk..
16:49 🔗 RKenshin has joined #internetarchive.bak
16:49 🔗 Kenshin has quit IRC (Read error: Operation timed out)
16:49 🔗 RKenshin is now known as Kenshin
16:50 🔗 svchfoo1 sets mode: +o Kenshin
17:02 🔗 atomotic has joined #internetarchive.bak
17:13 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
18:14 🔗 HCross2 I've got an issue. I checksum at the end of each file download, but on ARM cores this takes an age. Can I move the checksum to the end of the downloading?
18:15 🔗 HCross2 I've basically got a smartphone with a 6tb HDD attached
19:33 🔗 atomotic has joined #internetarchive.bak
19:58 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
20:07 🔗 thelsdj HCross2: have you tried parallelizing the downloads? i wonder if that would allow for download to continue while checksum is happening
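
[Editor's sketch] One way to read thelsdj's suggestion is to hand each finished file to a background thread for checksumming while the next download proceeds. The sketch below is a generic illustration, not how the IA.BAK client itself handles verification. `fetch` is a hypothetical stand-in that just writes bytes to disk instead of downloading, and all function names are made up for this example. CPython's `hashlib` releases the GIL while hashing larger buffers, so the hashing thread can overlap with I/O in the main thread.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def fetch(name, payload, dest_dir):
    """Stand-in for a real download: writes payload to dest_dir/name."""
    path = Path(dest_dir) / name
    path.write_bytes(payload)
    return path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so memory use stays bounded."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def download_and_verify(items, dest_dir, hash_workers=1):
    """items: iterable of (name, payload, expected_sha256).

    Each file is "downloaded" in the calling thread; its checksum is
    computed in a worker thread while the next download starts.
    Returns {name: True if the checksum matched, else False}.
    """
    pending = {}
    with ThreadPoolExecutor(max_workers=hash_workers) as hashers:
        for name, payload, expected in items:
            path = fetch(name, payload, dest_dir)
            pending[name] = (hashers.submit(sha256_of, path), expected)
        return {name: fut.result() == expected
                for name, (fut, expected) in pending.items()}
```

On a slow ARM core the hashing would still compete for CPU, so this only helps to the extent that downloads are waiting on the network or disk rather than the processor.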
23:05 🔗 kyan has quit IRC (Remote host closed the connection)
23:14 🔗 iabak-reg registrar master a6caaf2 other SHARD4/pubkeys registration of mail on SHARD4
23:14 🔗 iabak-reg registrar master da31d34 other SHARD15/pubkeys registration of mail on SHARD15
23:36 🔗 sevs has quit IRC (Ping timeout: 268 seconds)
