Time |
Nickname |
Message |
04:19
🔗
|
|
jmeyer has joined #internetarchive.bak |
07:11
🔗
|
iabak-reg |
registrar master c001224 other SHARD12/pubkeys registration of removed@gmail.com on SHARD12 |
07:41
🔗
|
|
atomotic has joined #internetarchive.bak |
08:00
🔗
|
|
atomotic has quit IRC (Remote host closed the connection) |
08:18
🔗
|
|
atomotic has joined #internetarchive.bak |
09:33
🔗
|
iabak-reg |
registrar master 90dc5a0 other SHARD7/pubkeys registration of jdamery+iabak on SHARD7 |
10:01
🔗
|
|
jmeyer has quit IRC (Quit: http://chat.efnet.org (Session timeout)) |
12:19
🔗
|
|
kyan has quit IRC (Remote host closed the connection) |
13:42
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
14:37
🔗
|
sep332_ |
the iabak-cronjob is taking a long time |
14:37
🔗
|
sep332_ |
I ran it overnight and it's 10% done |
14:42
🔗
|
db48x |
sep332_: what's the recent output look like? |
14:42
🔗
|
sep332_ |
remote: Resolving deltas: 10% (760769/7521255) |
14:42
🔗
|
sep332_ |
this is in shard4 |
14:44
🔗
|
db48x |
so it's syncing |
14:44
🔗
|
sep332_ |
yeah, the number is going up but very slowly |
14:45
🔗
|
sep332_ |
I just don't remember this taking all night before |
14:45
🔗
|
db48x |
no, it shouldn't |
14:45
🔗
|
Senji |
sync is running very slow for me too. Is the server overloaded? |
14:45
🔗
|
db48x |
probably |
14:57
🔗
|
|
atomotic has joined #internetarchive.bak |
15:13
🔗
|
|
jmeyer has joined #internetarchive.bak |
16:44
🔗
|
Senji |
This sync has been running for about 8 hours now :) |
17:08
🔗
|
|
atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…) |
17:31
🔗
|
|
atomotic has joined #internetarchive.bak |
17:48
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
18:47
🔗
|
|
beardicus has joined #internetarchive.bak |
19:38
🔗
|
sep332_ |
I'm at 24%! progress! |
19:40
🔗
|
Senji |
I'm at 5% :) |
20:10
🔗
|
closure |
load average: 22.25, 21.88, 22.02 |
20:13
🔗
|
|
jmeyer has quit IRC (Quit: http://chat.efnet.org (Session timeout)) |
20:15
🔗
|
closure |
seems disk bound |
20:47
🔗
|
closure |
server can't keep up with load anymore. Especially since it's not on an SSD any longer |
20:51
🔗
|
sep332_ |
ouch |
21:01
🔗
|
|
arkiver has quit IRC (ZNC - http://znc.in) |
21:01
🔗
|
|
joepie91 has quit IRC (ZNC - http://znc.in) |
21:19
🔗
|
sep332_ |
is checking in once per day too often now that we have 90 clients updating >350 shards? |
21:22
🔗
|
closure |
seems so |
21:22
🔗
|
closure |
350?! |
21:37
🔗
|
sep332_ |
http://iabak.archiveteam.org/stats/ALL.leaderboard-raw |
21:38
🔗
|
sep332_ |
well, 90 clients with ~4 shards each |
21:39
🔗
|
sep332_ |
whoops, seems the box is down now |
21:41
🔗
|
iabak-reg |
registrar master df34cda other SHARD17/pubkeys registration of hcross on SHARD17 |
21:41
🔗
|
closure |
bounced it since it was not going to recover from that load. let's see if it overloads again soon |
21:44
🔗
|
closure |
it's getting 2000+ logins per day |
21:45
🔗
|
closure |
seems excessive for that many clones |
21:48
🔗
|
db48x |
that does seem like a lot |
21:48
🔗
|
db48x |
active clients log in every hour though |
21:49
🔗
|
closure |
interestingly SHARD1 is 1200 of the logins |
21:50
🔗
|
closure |
SHARD2 600, rest of the shards consistently <130 |
21:50
🔗
|
closure |
probably a lot more clients have shard1 than the rest |
21:51
🔗
|
db48x |
SHARD1 only has 5 unexpired contributors |
21:51
🔗
|
closure |
well that's wacky |
21:52
🔗
|
sep332_ |
my shard1 nearly expired but I think it got it in time. my others are expired as of yesterday though |
21:52
🔗
|
db48x |
sshd logs the fingerprint of the keys used to log in |
21:52
🔗
|
|
atomotic has joined #internetarchive.bak |
21:52
🔗
|
db48x |
sep332_: they can be revived |
21:55
🔗
|
sep332_ |
"git annex sync", right? |
21:59
🔗
|
db48x |
sep332_: mark your repository as semitrusted again first |
21:59
🔗
|
db48x |
git annex semitrust <uuid> |
22:00
🔗
|
db48x |
closure: some keys are used more frequently than others: |
22:00
🔗
|
db48x |
151 RSA SHA256:ML3P4G+t1masAvYrwgjhOInfx7kJS2lzen6pJt6DqfU |
22:00
🔗
|
db48x |
160 RSA SHA256:+FO1n6ewW+wtCQXnCcVddCyS5taghDbl18mbuVYqihc |
22:00
🔗
|
db48x |
161 RSA SHA256:pksUe/GxQa1sC41URT2SrsF5eYtTZ5HZoNC6MMFtTOA |
22:00
🔗
|
db48x |
164 RSA SHA256:Oq8ViROKxS2LbHmtoVkpA516sfoMLEjmxTyMcVqCgT8 |
22:00
🔗
|
db48x |
171 RSA SHA256:aMytbVBArPZBxAbINXXhbGlkez1ewYzdzdejXYGOwR0 |
22:00
🔗
|
db48x |
I used journalctl -b | grep 'Accepted publickey' | cut -d ':' -f 5,6 | sort | uniq -c | sort -h |
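The pipeline above tallies accepted public-key logins per key fingerprint. A rough Python equivalent is sketched below. The function name `count_logins_by_key` and the regular expression are illustrative assumptions, not part of the iabak tooling, and the pattern assumes OpenSSH's usual `Accepted publickey for USER from HOST port N ssh2: TYPE SHA256:...` log line.

```python
import re
from collections import Counter

# Assumed sshd log shape; captures the key type and fingerprint,
# e.g. "RSA SHA256:ML3P4G+t1mas...".
FINGERPRINT_RE = re.compile(
    r"Accepted publickey for \S+ from \S+ port \d+ ssh2: (\S+ SHA256:\S+)"
)


def count_logins_by_key(lines):
    """Count accepted public-key logins per key fingerprint.

    Returns (fingerprint, count) pairs in ascending order of count,
    mirroring `sort | uniq -c | sort -h`. Lines that are not accepted
    public-key logins are ignored.
    """
    counts = Counter()
    for line in lines:
        match = FINGERPRINT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    # Sort by count first, then by fingerprint so the order is stable.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))
```

To use it on real data, pass it the lines of `journalctl -b` output, for example by iterating over `sys.stdin`. The busiest keys end up last in the returned list, as in the output quoted above.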
22:08
🔗
|
db48x |
hmm, I logged in 107 times this week |
22:12
🔗
|
closure |
I looked at the 171 one, and it's a user who has quite a lot of shards checked out, so I think is probably not doing anything wrong |
22:13
🔗
|
db48x |
looking at my own logins for SHARD1, it happens twice per day |
22:14
🔗
|
closure |
yes, same for the 171 login user for SHARD20 |
22:15
🔗
|
closure |
they're hitting most shards, which is fine if they have a lot of disk space |
22:16
🔗
|
closure |
it's not re-overloaded so far; my gut feeling is that it's on the edge of disk IO starvation and will sometimes tip over when cron jobs are running etc |
22:16
🔗
|
db48x |
ah, iabak-helper syncs all the shards first thing, then syncs each shard again at the end |
22:16
🔗
|
db48x |
each shard it handles |
22:17
🔗
|
closure |
the first sync is to learn what others have not gotten; the last is to report back what it's gotten |
22:17
🔗
|
* |
closure wonders why iotop is not working on the iabak server |
22:18
🔗
|
db48x |
yes, but the cronjob doesn't download anything else |
22:18
🔗
|
db48x |
it calls handleshard in CRONJOB mode, but handleshard can only ever skip it or fsck it |
22:18
🔗
|
closure |
ah, so only 1 sync needed then, I see |
22:19
🔗
|
closure |
well, that would probably keep the server from overloading immediately |
22:19
🔗
|
|
mls has quit IRC (Read error: Connection reset by peer) |
22:19
🔗
|
closure |
new shards probably will need to live somewhere else |
22:24
🔗
|
db48x |
pushed a change for that |
22:24
🔗
|
closure |
it could also only run the fast fsck once per week, but always try to sync even when it didn't run it |
22:25
🔗
|
db48x |
I was thinking that the whole cronjob could just be run once per week |
22:25
🔗
|
closure |
well, problem with that is systems not always up, drives not always mounted etc |
22:26
🔗
|
db48x |
systemd will run the job if a week has passed |
22:26
🔗
|
closure |
if using systemd.. |
22:27
🔗
|
closure |
but it won't notice if the drive is only plugged in on Friday and it happens to run the job on mondays |
22:27
🔗
|
db48x |
true |
22:27
🔗
|
db48x |
we could include an automount that users who use removable drives could set up |
22:28
🔗
|
closure |
find -atime 1 |
22:32
🔗
|
sep332_ |
ok, all my shards [1-4] should be back. I'll check the graphs later tonight |
22:32
🔗
|
closure |
db48x: your commit seems to move the sync to before the download; it was after |
22:33
🔗
|
closure |
hmm, I am probably lost in the maze |
22:35
🔗
|
db48x |
I put it at the end of the download function |
22:35
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
22:43
🔗
|
closure |
pushed a change that should make the cron job only do the fsck and sync once a week |
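One common way to get "run often, but only do the expensive work once a week" is a timestamp file. The sketch below illustrates that general idea and is not the change that was actually pushed. The names `is_due`, `mark_done` and `WEEK_SECONDS` are hypothetical, and keeping the stamp file on the backup drive itself is an assumption.

```python
import os
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def is_due(stamp_path, interval=WEEK_SECONDS, now=None):
    """Return True if the stamp file is missing or at least `interval` old."""
    now = time.time() if now is None else now
    try:
        last_run = os.path.getmtime(stamp_path)
    except FileNotFoundError:
        # Never run before (or stamp lost): the work is due.
        return True
    return now - last_run >= interval


def mark_done(stamp_path, now=None):
    """Record a completed run by setting the stamp file's mtime to `now`."""
    now = time.time() if now is None else now
    # Create the file if needed without truncating existing content.
    with open(stamp_path, "a"):
        pass
    os.utime(stamp_path, (now, now))
```

With this shape the cron job can still fire daily: it calls `is_due()` and exits at once when nothing is due, and calls `mark_done()` only after the fsck and sync succeed. A drive that is only mounted on Fridays would then still get its weekly run on the first Friday after the interval has passed.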
22:44
🔗
|
closure |
it could probably be backed off to even 2 weeks |
22:58
🔗
|
sep332_ |
it sends an email if you don't check in in 2 weeks. so people might get a lot of emails |
23:32
🔗
|
|
joepie91 has joined #internetarchive.bak |