Time |
Nickname |
Message |
01:18
🔗
|
|
yipdw_ is now known as yipdw |
01:19
🔗
|
|
svchfoo1 sets mode: +o yipdw |
01:44
🔗
|
|
iten has joined #internetarchive.bak |
01:48
🔗
|
iten |
can I sign up for SHARD2@iabak.archiveteam.org:shard2? |
01:50
🔗
|
SketchCow |
Graphs broke |
01:51
🔗
|
db48x |
iten: of course! pastbin your key and I'll add it |
01:51
🔗
|
db48x |
SketchCow: could be my fault, or maybe nobody's connected |
01:51
🔗
|
iten |
ok, cool |
01:52
🔗
|
iten |
wait, pastebin or "paste in"? |
01:52
🔗
|
db48x |
pastebin is best |
01:55
🔗
|
iten |
ok, http://pastebin.com/sMqrbDMu |
01:55
🔗
|
SketchCow |
No, there's a problem |
01:55
🔗
|
SketchCow |
But closure will recognize |
01:58
🔗
|
db48x |
SketchCow: yes, I was being slightly facetious :) |
02:00
🔗
|
tpw_rules |
closure: i'm still getting one failure for shard1 |
02:02
🔗
|
tpw_rules |
oh. interesting. https://archive.org/download/Ttscribe/Ttscribe_files.xml has been darked |
02:05
🔗
|
tpw_rules |
so i'm complete and fscked except for that file |
02:05
🔗
|
db48x |
what's the failure? |
02:06
🔗
|
tpw_rules |
403 |
02:06
🔗
|
db48x |
ah |
02:06
🔗
|
db48x |
we'll just remove it from the collection |
02:06
🔗
|
tpw_rules |
why do they dark things? |
02:06
🔗
|
tpw_rules |
(and when will that be done) |
02:06
🔗
|
tpw_rules |
like are we gonna have problems having things they don't |
02:06
🔗
|
db48x |
for these items, it could be a dmca request or similar |
02:07
🔗
|
db48x |
yea, we'll end up with some files that were available for a while and then went dark, but that's not really a problem |
02:07
🔗
|
tpw_rules |
ok |
02:07
🔗
|
db48x |
well, not for us anyway |
02:07
🔗
|
tpw_rules |
when will that be removed so i can mark this as complete? |
02:08
🔗
|
db48x |
if we were backing up wayback machine stuff, then it could be a site that changed their robots.txt |
02:12
🔗
|
tpw_rules |
ima hold off the fsck until that is removed |
02:12
🔗
|
db48x |
no worries |
02:31
🔗
|
iten |
thanks! the server still isn't accepting my key though, should I just wait till it updates or something, or did I screw something up? |
02:31
🔗
|
|
beardicus has quit IRC (My MacBook Pro has gone to sleep. ZZZzzz…) |
02:37
🔗
|
db48x |
hrm! you're not the first to report that |
02:37
🔗
|
db48x |
what's your ip address? |
02:39
🔗
|
iten |
24.4.230.103 |
02:42
🔗
|
db48x |
Apr 07 22:30:28 ia-bak sshd[16070]: Connection closed by 24.4.230.103 [preauth] |
02:44
🔗
|
db48x |
no other details though |
02:45
🔗
|
iten |
yeah, all I get is 'Permission denied (publickey,password).' |
02:47
🔗
|
db48x |
just to double check, the key you gave me is the key that iabak generated for you? |
02:47
🔗
|
db48x |
it's stored in the id_rsa file in the same directory as iabak |
02:47
🔗
|
iten |
maybe tmux screwed up the copy/paste, let me double-check |
02:49
🔗
|
iten |
seems to be exactly the same (as in id_rsa.pub that is) |
02:50
🔗
|
iten |
the line previous to mine in the "pubkeys" file you committed starts with "sh-rsa" rather than "ssh-rsa" |
02:50
🔗
|
iten |
if that matters |
02:51
🔗
|
yipdw |
that does matter :) |
02:51
🔗
|
db48x |
heh, indeed |
02:51
🔗
|
db48x |
hater: that one's yours, so maybe that's the problem :) |
02:54
🔗
|
iten |
:) |
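The typo iten spotted, a key line starting with "sh-rsa" instead of "ssh-rsa", is the kind of thing a quick check could catch before a pubkeys file is committed. A minimal sketch, assuming one key per line with the key type as the first field and no leading options; the function name `check_pubkeys` and the list of accepted key types are illustrative and not part of IA.BAK:

```shell
#!/bin/sh
# Hypothetical helper: report lines whose first field is not a known key type.
# Assumes authorized_keys-style lines without leading options.
check_pubkeys() {
    awk '
        NF == 0 || /^#/ { next }   # skip blank lines and comments
        $1 !~ /^(ssh-rsa|ssh-dss|ssh-ed25519|ecdsa-sha2-nistp(256|384|521))$/ {
            printf "line %d: unknown key type \"%s\"\n", NR, $1
            bad = 1
        }
        END { exit bad + 0 }       # non-zero exit if any line was rejected
    ' "$1"
}
```

Running `check_pubkeys pubkeys` before committing would exit non-zero and name the offending line.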
03:00
🔗
|
|
toad1 has joined #internetarchive.bak |
03:37
🔗
|
|
chfoo has quit IRC (Remote host closed the connection) |
03:40
🔗
|
|
chfoo has joined #internetarchive.bak |
03:41
🔗
|
|
svchfoo3 sets mode: +o chfoo |
04:28
🔗
|
db48x |
http://iabak.archiveteam.org:8080/render/?width=1060&height=731&_salt=1428467312.185&from=00%3A00_20150407&until=23%3A59_20150407&lineMode=connected&tz=UTC&logBase=&bgcolor=000000&majorGridLineColor=FFFFFF&fgcolor=FFFFFF&minorGridLineColor=C0C0C0&colorList=red%2Cgold%2Cgreen&lineWidth=2&target=alias%28iabak.shardstats.numcopies.0.shard2%2C%220%20copies%22%29&target=alias%28sumSeries%28iabak.shardstats.numcopies.1.shard2%2 |
04:31
🔗
|
yipdw |
heh, my system's still syncing |
04:31
🔗
|
yipdw |
72 GB in two days |
04:32
🔗
|
db48x |
oops, a bug |
04:33
🔗
|
db48x |
http://iabak.archiveteam.org:8080/render/?width=1060&height=731&_salt=1428467542.784&from=00%3A00_20150407&until=23%3A59_20150407&lineMode=connected&tz=UTC&logBase=&bgcolor=000000&majorGridLineColor=FFFFFF&fgcolor=FFFFFF&minorGridLineColor=C0C0C0&colorList=red%2Cgold%2Cgreen%2Cwhite&lineWidth=2&yMin=0&target=alias%28iabak.shardstats.numcopies.0.shard2%2C%220%20copies%22%29&target=alias%28sumSeries%28iabak.shardstats.numco |
04:38
🔗
|
db48x |
even better: |
04:38
🔗
|
db48x |
http://iabak.archiveteam.org:8080/render/?width=1060&height=731&_salt=1428467843.343&from=00%3A00_20150407&until=23%3A59_20150407&lineMode=staircase&tz=UTC&logBase=&bgcolor=000000&majorGridLineColor=FFFFFF&fgcolor=FFFFFF&minorGridLineColor=C0C0C0&colorList=red%2Cgold%2Cgreen%2Cwhite&lineWidth=2&yMin=0&target=legendValue%28alias%28keepLastValue%28iabak.shardstats.numcopies.0.shard2%29%2C%220%20copies%22%29%2C%22last%22%29& |
04:38
🔗
|
db48x |
make your own at http://iabak.archiveteam.org:8080 |
04:42
🔗
|
db48x |
time for me to sleep |
05:18
🔗
|
|
Control-S has joined #internetarchive.bak |
05:24
🔗
|
|
Ctrl-S has quit IRC (Read error: Operation timed out) |
05:24
🔗
|
|
Control-S is now known as Ctrl-S |
05:39
🔗
|
|
SN4T14_ has quit IRC (Ping timeout: 306 seconds) |
05:45
🔗
|
|
SN4T14 has joined #internetarchive.bak |
05:51
🔗
|
SketchCow |
Just for the record, there's a mass of reasons for darking. Including: |
05:51
🔗
|
SketchCow |
- Rightholder request |
05:51
🔗
|
|
zottelbey has joined #internetarchive.bak |
05:51
🔗
|
SketchCow |
- Item replaced by better item |
05:51
🔗
|
SketchCow |
- Mistake |
05:51
🔗
|
SketchCow |
- Temporarily stored away out of sight until some problem is fixed |
05:51
🔗
|
SketchCow |
- Spam |
05:52
🔗
|
SketchCow |
- System file |
06:41
🔗
|
|
niyaje4 has joined #internetarchive.bak |
07:19
🔗
|
midas |
something went boom |
07:53
🔗
|
|
niyaje4 has quit IRC (Ping timeout: 600 seconds) |
08:40
🔗
|
|
ppiixx has joined #internetarchive.bak |
09:27
🔗
|
|
Senji is now known as Senji2 |
09:31
🔗
|
|
Senji has joined #internetarchive.bak |
09:31
🔗
|
|
Senji2 has quit IRC (leaving) |
12:05
🔗
|
|
atomotic has joined #internetarchive.bak |
12:27
🔗
|
hater |
db48x: that solved my problem ;) thank you |
12:27
🔗
|
db48x |
hater: you're welcome. iten spotted it; too bad we haven't figured out why _his_ key isn't working :P |
12:28
🔗
|
* |
hater is laughing |
12:30
🔗
|
hater |
damn ssh ^ |
12:30
🔗
|
db48x |
might be similar to why your previous key didn't work; too bad we don't know why that didn't work |
12:44
🔗
|
|
beardicus has joined #internetarchive.bak |
12:48
🔗
|
|
Atluxity has joined #internetarchive.bak |
13:00
🔗
|
hater |
db48x: install-git-annex gives an error if there is no internet connection |
13:01
🔗
|
hater |
replacing line 8 with this solves the problem: if [ -n "$newVersion" ] && [ "$installedVersion" -lt "$newVersion" ] |
13:02
🔗
|
hater |
i'm too lazy to update my fork, create a branch and then pull a request |
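hater's replacement condition can be tried in isolation. The sketch below wraps it in a hypothetical `needs_update` function and assumes both versions are plain integers (for example date-style numbers), since `-lt` only compares integers. The variable names come from the message above; everything else is illustrative:

```shell
#!/bin/sh
# Hypothetical wrapper around the condition hater proposed.
# Returns 0 (true) only when a new version is known and is greater.
needs_update() {
    installedVersion=$1
    newVersion=$2   # empty when the version lookup failed, e.g. no network
    if [ -n "$newVersion" ] && [ "$installedVersion" -lt "$newVersion" ]; then
        return 0
    fi
    return 1
}
```

With an empty `newVersion`, the first test fails and `-lt` is never evaluated, which avoids the error.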
14:18
🔗
|
midas |
the connected users file changed or isnt working anymore, thats why the map is empty btw |
14:23
🔗
|
Senji |
'cor these fscks are ...slow... |
14:35
🔗
|
SketchCow |
Ok, so, let's quickly figure out why the graphs page went flat-a-roo |
14:38
🔗
|
SketchCow |
closure: The stats.tar.gz that is downloaded from iabak has zero-length geolists. |
14:38
🔗
|
SketchCow |
db48x: Unless you're generating the files now. |
15:05
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
15:09
🔗
|
db48x |
I touched the script that generates them |
15:09
🔗
|
db48x |
it works when I run it myself |
15:10
🔗
|
db48x |
let's see if I can figure it out now that I've slept and eaten and taken a pain killer |
15:14
🔗
|
db48x |
yep, I see the problem :) |
15:14
🔗
|
SketchCow |
You did it, you killed it |
15:15
🔗
|
SketchCow |
Blood on your hands |
15:15
🔗
|
SketchCow |
mixing with your already extant blood |
15:15
🔗
|
db48x |
yep |
15:15
🔗
|
db48x |
https://github.com/db48x/IA.BAK/commit/61774ca0a798c8f079a7ac05dcc3542da2451fb3#diff-98514952a623b3aeedab83f2e5c6d08eR85 |
15:16
🔗
|
db48x |
I saw this last night, but didn't think it was the cause |
15:16
🔗
|
db48x |
just noted it as something else I needed to fix |
15:17
🔗
|
Senji |
My fast fsck took 3h7m; the non-fast fsck is still running (about 1h30 in). I don't have a huge amount of shard1 though so I don't have any idea how that figure would scale if I had more |
15:19
🔗
|
db48x |
Senji: 3 hours isn't very fast! |
15:20
🔗
|
sep332 |
Senji: how much do you have? |
15:20
🔗
|
Senji |
Indeed, and it looks as if it's IO bound, not CPU bound (the machine it's running on isn't very beefy; so CPU boundness could be a problem) |
15:21
🔗
|
Senji |
sep332: if I run du on shard1 now that's going to adversely affect the normal-fsck time :) |
15:21
🔗
|
db48x |
heh, probably not |
15:22
🔗
|
db48x |
either the data du wants is already in your disk cache, in which case it's free, or it's data that fsck will eventually want, in which case it's free |
15:22
🔗
|
Senji |
I think I have about a tenth of it though |
15:22
🔗
|
db48x |
git annex find --in=here will list all the files you have |
15:30
🔗
|
Senji |
db48x: that's only true if I have enough buffer cache available :-). I'm a bit disbelieving that things will stay in buffer cache for 3 hours :) |
15:32
🔗
|
db48x |
quite possibly true :) |
15:35
🔗
|
Senji |
Presumably I should avoid downloading any particular shard onto more than one machine; because otherwise I might end up with two copies of the same file (which, on machines less than a metre apart might be a bad idea) |
15:39
🔗
|
db48x |
yea, that's a risk |
15:39
🔗
|
db48x |
you can sync the file lists manually if you want, though |
15:41
🔗
|
db48x |
use git annex find --in=here on one, then use git annex drop on the other to drop that list of files |
15:41
🔗
|
Senji |
I've still got quite a bit of space to fill up on this machine; I'll just wait until shard3 for the other one :-) |
15:41
🔗
|
db48x |
hmm |
15:42
🔗
|
db48x |
in fact, I bet you could set up a rule to make each repository fail to want anything which specific other repositories already have |
15:43
🔗
|
db48x |
git annex wanted --not --in=otherrepo |
15:44
🔗
|
hater |
db48x: how's your hand? |
15:46
🔗
|
db48x |
nothing's fallen off, and I can type if I've taken something for the pain recently |
15:46
🔗
|
db48x |
SketchCow: yay :) http://iabak.archiveteam.org/stats/ALL.geolist |
15:48
🔗
|
Senji |
I think you've got my "city" there from my ISP's name (which is "Andrews & Arnold") :-) |
15:48
🔗
|
db48x |
heh |
15:48
🔗
|
db48x |
blame freegeoip.net |
15:52
🔗
|
closure |
ok, sounds like we should run fast fsck less frequently |
15:52
🔗
|
closure |
although on an ssd, I can fsck in around 12 minutes I think |
15:52
🔗
|
Senji |
closure: I can't imagine my disks are that much slower; maybe it is CPU-bound and just not when I'm looking at top |
15:53
🔗
|
closure |
seek-bound |
16:03
🔗
|
* |
db48x sighs |
16:04
🔗
|
db48x |
I should tweak shardstats so that it doesn't have to actually call git annex info every time I want to test it |
16:23
🔗
|
|
Start-mob has joined #internetarchive.bak |
16:26
🔗
|
|
Start sets mode: +o Start-mob |
16:40
🔗
|
|
Start has quit IRC (Disconnected.) |
16:40
🔗
|
Senji |
I have 10222 files in shard1 |
16:55
🔗
|
|
Start-mob has quit IRC (Remote host closed the connection) |
16:55
🔗
|
|
Start-mob has joined #internetarchive.bak |
16:55
🔗
|
|
svchfoo1 sets mode: +o Start-mob |
17:00
🔗
|
midas |
db48x: it's scary how close it is to where i really am |
17:01
🔗
|
midas |
closure: fsck fast? i can test that on my dataset to see how fast it goes |
17:02
🔗
|
|
Start-mob has quit IRC (Ping timeout: 370 seconds) |
17:05
🔗
|
midas |
ah iabak runs that |
17:06
🔗
|
db48x |
midas: there's a reason we call them ICBM coordinates ;) |
17:07
🔗
|
midas |
aye :p |
17:07
🔗
|
midas |
freegeoip just sank my battleship |
17:10
🔗
|
midas |
db48x: do all pubkeys for shard1 also have access to shard2? |
17:14
🔗
|
db48x |
not automatically, but I've been adding keys to both at the same time |
17:37
🔗
|
Senji |
So, here's my fast fsck vs fsck stats: http://pastebin.ubuntu.com/10774937/ |
17:42
🔗
|
SketchCow |
Looks like the graphs work again - thanks, db48x |
17:46
🔗
|
db48x |
SketchCow: you're welcome |
17:50
🔗
|
db48x |
we should be doing >&- 2>&- there, rather than /dev/null |
17:51
🔗
|
db48x |
shouldn't have a huge effect on the time though |
17:52
🔗
|
db48x |
oops, infinite loop :P |
17:56
🔗
|
db48x |
protip: always increment your loop variable |
17:56
🔗
|
sep332 |
if it's worth looping, it's worth looping forever, right? |
17:58
🔗
|
db48x |
:) |
17:58
🔗
|
sep332 |
one minute you want a loop, the next minute you don't... just make up your mind |
18:04
🔗
|
SketchCow |
The big question is if the "connected users" are meant to no longer be sawtooth |
18:07
🔗
|
db48x |
SketchCow: hard to say if it was ever meant to be a sawtooth in the first place :) |
18:11
🔗
|
db48x |
maybe someone changed the munin configuration? I haven't |
18:38
🔗
|
hater |
db48x: are you working on git-annex? |
18:41
🔗
|
ersi |
ask what you want to ask instead |
18:41
🔗
|
hater |
running more iabak-instances at once results in a very high cpu-load |
18:41
🔗
|
Senji |
ask, don't ask to ask :-) |
18:41
🔗
|
Senji |
hater: well, of course |
18:42
🔗
|
hater |
if the parallel feature in git-annex is coming soon, there will be no need to open an issue ticket |
18:42
🔗
|
Senji |
'cos that's what cpu-load *measures* |
18:42
🔗
|
hater |
fastfsck is (when running multiple instances) more like a bug than a feature |
18:43
🔗
|
Senji |
I think I convinced closure earlier that fast fsck wants to run less often :) |
18:43
🔗
|
hater |
oh |
18:44
🔗
|
|
Start has joined #internetarchive.bak |
18:44
🔗
|
hater |
(and fastfsck throws an error after some time because there are other fsck-instances running(and locking)) |
18:45
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
18:45
🔗
|
|
Start has joined #internetarchive.bak |
18:45
🔗
|
Senji |
Also the hourly git updates tend to squash each other |
18:45
🔗
|
db48x |
I haven't really done any work on git annex itself |
18:45
🔗
|
db48x |
I've perused the source a little |
18:45
🔗
|
ersi |
joey however, although he's not in here |
18:46
🔗
|
db48x |
Senji: squash? |
18:46
🔗
|
hater |
i think we should write down the issues or some of them will get lost during development |
18:46
🔗
|
db48x |
github tickets work well enough at this scale :) |
18:47
🔗
|
Senji |
db48x: if you start 10 copies of iabak now then in an hour it runs 10 sets of the git update; and they either run very slow or some of them fail |
18:47
🔗
|
db48x |
mmm, very good point |
18:47
🔗
|
hater |
Senji: a simple .file-lock should work |
18:48
🔗
|
db48x |
it's not super critical that every sync succeed, but we should only sync each shard once per hour, no matter how often you start iabak |
18:48
🔗
|
Senji |
with-lock-ex is often your friend :) |
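A lock around the hourly sync, as hater and Senji suggest, might look like this. The sketch uses `mkdir` as an atomic lock because it needs no extra tools; it is a stand-in for `with-lock-ex` or `flock`, not what iabak actually does. `run_locked` and the lock path are hypothetical names:

```shell
#!/bin/sh
# Hypothetical helper: run a command only if no other instance holds the lock.
# mkdir either creates the directory or fails, atomically, so it works as a lock.
# Caveat: a killed process leaves the lock directory behind (no stale-lock cleanup).
run_locked() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "lock $lockdir is held, skipping" >&2
    return 75   # arbitrary "try again later" code chosen for this sketch
}
```

Ten concurrent iabak instances calling something like `run_locked shard1/.sync-lock git annex sync` would then run one sync and skip the other nine.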
18:55
🔗
|
|
Start has quit IRC (Disconnected.) |
19:07
🔗
|
|
Start has joined #internetarchive.bak |
19:23
🔗
|
sep332 |
the only reason to run iabak in parallel is to download faster, right? |
19:23
🔗
|
sep332 |
so parallelizing git-annex get would do that |
19:24
🔗
|
sep332 |
and you'd only need to run iabak once, maybe with a flag for how many downloads to do at once |
19:25
🔗
|
|
Start has quit IRC (Disconnected.) |
19:42
🔗
|
SketchCow |
My next question is if shard2's growth/downloads are being accurately reflected. |
19:42
🔗
|
SketchCow |
http://iabackup.archiveteam.org/ia.bak/SHARD2 |
19:42
🔗
|
SketchCow |
That seems really, really slow |
19:46
🔗
|
|
Start-mob has joined #internetarchive.bak |
19:46
🔗
|
|
svchfoo1 sets mode: +o Start-mob |
19:49
🔗
|
|
SN4T14_ has joined #internetarchive.bak |
19:53
🔗
|
|
SN4T14 has quit IRC (Ping timeout: 306 seconds) |
19:59
🔗
|
|
Start-mob has quit IRC (Remote host closed the connection) |
20:01
🔗
|
|
atomotic has joined #internetarchive.bak |
20:03
🔗
|
atomotic |
hi there. i could run iabak |
20:04
🔗
|
atomotic |
who can give me access to SHARD2@iabak.archiveteam.org ? |
20:05
🔗
|
ersi |
Hang around and you'll get it |
20:07
🔗
|
atomotic |
ok |
20:09
🔗
|
|
Start-mob has joined #internetarchive.bak |
20:09
🔗
|
|
svchfoo3 sets mode: +o Start-mob |
20:17
🔗
|
closure |
Senji: hey, could you try this: time git annex fsck --fast --in here |
20:18
🔗
|
closure |
my theory is that might be enough faster to use it |
20:18
🔗
|
|
Start-mob has quit IRC (Leaving) |
20:18
🔗
|
|
Start-mob has joined #internetarchive.bak |
20:19
🔗
|
|
svchfoo1 sets mode: +o Start-mob |
20:19
🔗
|
Senji |
Trying that |
20:21
🔗
|
Senji |
If it takes as long as earlier I'll be asleep when it finishes though :) |
20:21
🔗
|
closure |
here, it took 2 minutes |
20:22
🔗
|
closure |
vs 12 before |
20:22
🔗
|
closure |
will depend on the amount of files you have |
20:22
🔗
|
sep332 |
closure: can you add atomotic? |
20:25
🔗
|
closure |
atomotic: msg me your key |
20:25
🔗
|
closure |
db48x: oh, you de-sawtoothed it? How? I didn't understand why it was doing that in the munin graph |
20:29
🔗
|
closure |
SketchCow: SHARD2 stats match what I see if I run "git annex info ." in a clone of SHARD2 |
20:29
🔗
|
closure |
you can try that at home :) |
20:29
🔗
|
closure |
I think that some people were tearing through, but their disks are full now |
20:30
🔗
|
yipdw |
slow and steady wins the race; can't lose with Comcast |
20:30
🔗
|
closure |
ah, good, we caught back up on SHARD1 with those 60k files I had to redo |
20:30
🔗
|
closure |
hah |
20:31
🔗
|
yipdw |
I think I'm about 80 gigs into shard1 |
20:31
🔗
|
yipdw |
at this point we'd have filled up pre-doubler Johnny Mnemonic |
20:32
🔗
|
|
Start has joined #internetarchive.bak |
20:32
🔗
|
atomotic |
keep free> 1TB |
20:32
🔗
|
atomotic |
numfmt: invalid suffix in input ‘1TB’: ‘B’ |
20:32
🔗
|
atomotic |
./iabak-helper: 101: [: -lt: argument expected |
20:33
🔗
|
closure |
try tb, I think? |
20:33
🔗
|
|
Start-mob has quit IRC (Leaving) |
20:34
🔗
|
|
Start-mob has joined #internetarchive.bak |
20:34
🔗
|
|
svchfoo1 sets mode: +o Start-mob |
20:35
🔗
|
closure |
hmm, this numfmt thing does not seem to work for me either |
20:36
🔗
|
yipdw |
maybe just 1T |
20:36
🔗
|
closure |
aha, it only accepts "1T" |
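Since numfmt rejected "1TB" but accepted "1T", a prompt could normalize the input before handing it over. A sketch, assuming only plain K/M/G/T style suffixes need handling; `normalize_size` is a hypothetical helper, and which `--from` mode iabak-helper uses is not shown in the log:

```shell
#!/bin/sh
# Hypothetical helper: turn "1TB" or "1tb" into "1T".
# Only handles plain K/M/G/T suffixes; forms like "1TiB" are not covered.
normalize_size() {
    size=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    printf '%s\n' "${size%B}"   # drop one trailing "B" if present
}

# Usage (assumes GNU numfmt is installed; --from=iec is an assumption):
#   numfmt --from=iec "$(normalize_size 1tb)"
```

The second error atomotic pasted, `[: -lt: argument expected`, is what an unquoted empty value produces in a `[ ... -lt ... ]` test, so checking that the converted value is non-empty before comparing would also help.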
20:36
🔗
|
|
Start-mob has quit IRC (Remote host closed the connection) |
20:37
🔗
|
SketchCow |
closure: So it SOUNDS like I should scare up a few more people. |
20:37
🔗
|
atomotic |
btw it started and is running |
20:37
🔗
|
SketchCow |
I wish we knew how much "committed space" there is - i.e. how much space people have, and how much they're using. |
20:37
🔗
|
closure |
SketchCow: still have some settling in on shard1, need to get fscking and expiry going |
20:37
🔗
|
SketchCow |
so say it's a 1gb shard, and three people with 500gb drives are helping. |
20:37
🔗
|
closure |
well, we know how much space people are using (it's a bit expensive to query it though) |
20:38
🔗
|
SketchCow |
And we know, then, that there's 3x499gb of "unused" |
20:38
🔗
|
Senji |
I have about another TB on my current machine; and another TB on another machine. |
20:38
🔗
|
SketchCow |
Like, the All-shard is currently 6.23tb. Of our gang of 31/32, how much more space is there? |
20:38
🔗
|
closure |
yeah, I hear you on unused. Hmm, could do something with metadata to record that in git |
20:39
🔗
|
Senji |
And maybe a third terabyte if I pull out some old disks :) |
20:39
🔗
|
SketchCow |
Well, right now, we know that 32 clients have Shard 1, and nine have Shard 2. |
20:39
🔗
|
SketchCow |
Maybe we need to ask people who are backing up Shard 1 to add in Shard 2 if they haven't. |
20:40
🔗
|
SketchCow |
And we need a check to say "hey, you're keeping a second copy of same shard, sorry, no" |
20:44
🔗
|
sep332 |
is there a way to tell git-annex to avoid files that are already on another box? |
20:44
🔗
|
sep332 |
If I have Box A and Box B right next to each other, can I tell them to get different files? |
20:44
🔗
|
closure |
sep332: yes, you can. |
20:44
🔗
|
SketchCow |
Yes. |
20:44
🔗
|
closure |
one way is to write down that uuid and say "git annex get --not --in $uuid" |
20:45
🔗
|
closure |
another way is to set up a git remote from one box, pointing at the repo in the other, and then you can say "git annex get --not --in $remote" |
20:45
🔗
|
sep332 |
the uuid one looks perfect, thanks |
20:46
🔗
|
closure |
(you'll still want --not --copies 4 too) |
20:46
🔗
|
closure |
you can also move files between repos, etc |
20:48
🔗
|
closure |
btw you get the uuid from git config annex.uuid |
20:49
🔗
|
sep332 |
for multiples, do I need --not --in $uuidA --not --in $uuidB |
20:49
🔗
|
sep332 |
or is that too many --not's |
20:53
🔗
|
closure |
as many as you like |
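For several neighbouring boxes, the repeated `--not --in` options closure describes can be generated from a list of UUIDs. A minimal sketch; `build_get_filter` is a hypothetical name, and the trailing `--not --copies 4` is taken from closure's remark above:

```shell
#!/bin/sh
# Hypothetical helper: print one "--not --in UUID" pair per argument,
# followed by the "--not --copies 4" limit mentioned in the channel.
build_get_filter() {
    for uuid in "$@"; do
        printf '%s ' --not --in "$uuid"
    done
    printf '%s\n' '--not --copies 4'
}

# Usage, run inside the shard's repository on the box doing the download
# (left unquoted on purpose so the options are split into separate words):
#   git annex get $(build_get_filter "$uuidA" "$uuidB")
```

Each UUID comes from running `git config annex.uuid` in the corresponding repository.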
20:54
🔗
|
|
atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…) |
20:54
🔗
|
|
zottelbey has quit IRC (Remote host closed the connection) |
20:56
🔗
|
|
atomotic has joined #internetarchive.bak |
21:04
🔗
|
closure |
joey@beaver:~/lib/backup/IA.BAK/shard1>time ../git-annex.linux/git-annex fsck --in . --fast --quiet |
21:04
🔗
|
closure |
Bad file size (67 B larger); moved to .git/annex/bad/MD5-s2038--c09a81cc07e2ab2d0592d7aed9feaacb |
21:04
🔗
|
closure |
that's why it's good to run these fscks! |
21:04
🔗
|
closure |
probably http resume failure.. |
21:06
🔗
|
|
acridAxid has quit IRC (Quit: Quitting) |
21:09
🔗
|
|
acridAxid has joined #internetarchive.bak |
21:09
🔗
|
|
svchfoo3 sets mode: +o acridAxid |
21:09
🔗
|
closure |
real 4m25.007s |
21:10
🔗
|
closure |
that's on a spinning disk, so not very bad for the fast fsck |
21:10
🔗
|
Senji |
I think my machine must just be too slow :) |
21:11
🔗
|
Senji |
It's still going on its fast fsck; 52mins in :) |
21:11
🔗
|
closure |
what kind of disk bus and filesystem is it? |
21:12
🔗
|
Senji |
SATA / RAID-1 / ext3 (with dir_index) |
21:12
🔗
|
Senji |
I think they're 7200 rpm disks |
21:12
🔗
|
closure |
hmm, mine is SATA, ext4 |
21:13
🔗
|
closure |
not over nfs or something is it? |
21:13
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
21:14
🔗
|
Senji |
No, that'd be horrible :) |
21:14
🔗
|
|
atomotic has joined #internetarchive.bak |
21:14
🔗
|
Senji |
She Who Must Be Obeyed calls me away from the computer. |
21:21
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
21:21
🔗
|
|
Start has joined #internetarchive.bak |
21:23
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
21:23
🔗
|
|
Start has joined #internetarchive.bak |
21:25
🔗
|
SketchCow |
-------------------------------------------------------------------- |
21:25
🔗
|
SketchCow |
If you're running a client for just Shard 1, consider doing one for |
21:25
🔗
|
SketchCow |
Shard 2. We don't want to expand the test client pool too much and |
21:26
🔗
|
SketchCow |
hey, you got it working once. |
21:26
🔗
|
closure |
iabak will automatically switch over to shard2 when it's run already |
21:26
🔗
|
SketchCow |
-------------------------------------------------------------------- |
21:26
🔗
|
|
Start-mob has joined #internetarchive.bak |
21:26
🔗
|
SketchCow |
you say that and yet the clients do not match |
21:26
🔗
|
SketchCow |
I see 32 on one, 9 on the other |
21:26
🔗
|
closure |
not everyone is running the script repeatedly |
21:26
🔗
|
SketchCow |
Maybe they need to know about a command |
21:26
🔗
|
SketchCow |
That counts |
21:26
🔗
|
|
svchfoo2 sets mode: +o Start-mob |
21:26
🔗
|
SketchCow |
That counts as someone going "OK, how do I do that" |
21:26
🔗
|
SketchCow |
And you go "run the thing" |
21:27
🔗
|
SketchCow |
I didn't say the tech support was hard! |
21:27
🔗
|
|
niyaje4 has joined #internetarchive.bak |
21:38
🔗
|
|
atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com) |
21:42
🔗
|
|
Start-mob has quit IRC (Ping timeout: 370 seconds) |
21:46
🔗
|
|
Start has quit IRC (Read error: Connection reset by peer) |
21:46
🔗
|
|
Start_ has joined #internetarchive.bak |
21:51
🔗
|
|
Start_ is now known as Start |
22:23
🔗
|
|
Start has quit IRC (Disconnected.) |
22:27
🔗
|
|
Start-mob has joined #internetarchive.bak |
22:30
🔗
|
|
niyaje4 has quit IRC (Read error: Operation timed out) |
22:34
🔗
|
|
Start-mob has quit IRC (Remote host closed the connection) |
22:47
🔗
|
Senji |
With --in here: 80 minutes: so more than twice as fast |
22:52
🔗
|
closure |
better than nothing, but .. |
22:52
🔗
|
closure |
if it's that slow for others, let's find a different way |
22:54
🔗
|
closure |
if you want to hack your script locally, you can make it fsck --fast a single file or small directory. This still prevents your repo getting expired due to inactivity |
22:55
🔗
|
|
Start has joined #internetarchive.bak |
22:56
🔗
|
closure |
it'll still do a full fsck monthly |
22:56
🔗
|
|
svchfoo3 sets mode: +o Start |
22:57
🔗
|
Senji |
Once I've mostly filled up my space on this machine it shouldn't be a problem -- and the lock so it only fscks once at a time helps a lot with running multiple copies |
22:58
🔗
|
closure |
wait.. is your machine running other iabak's concurrently? |
22:58
🔗
|
Senji |
Not while I was testing |
22:59
🔗
|
closure |
ok, ok |
22:59
🔗
|
Senji |
I'm just getting it to run 10 in parallel now; 'cos bandwidth is a lot cheaper overnight :-) |
23:00
🔗
|
closure |
I really should finish git annex get -jN |
23:00
🔗
|
closure |
I have everything except one little piece, and the progress bar library still needs a lot of work |
23:00
🔗
|
closure |
maybe by monday :) |
23:05
🔗
|
closure |
aha |
23:05
🔗
|
db48x |
yea |
23:06
🔗
|
db48x |
do the timestamps in this file have to be in this format? |
23:07
🔗
|
closure |
that was simply the first half-way reasonable thing I came up with |
23:07
🔗
|
closure |
and I broke the hostname, doh |
23:07
🔗
|
db48x |
I'd like to do journalctl --unit=ssh.service --output=short-iso --utc |
23:09
🔗
|
|
Start-mob has joined #internetarchive.bak |
23:09
🔗
|
closure |
well, I can turn on persistent journaling |
23:09
🔗
|
db48x |
then it can just use cut instead of perl :) |
23:11
🔗
|
db48x |
oops, forgot about zgrep |
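With ISO timestamps in the first field, pulling the time and client address out of sshd lines becomes a job for `cut`, as db48x says. A sketch under assumptions: the sample line in the comment is made up (modelled on the "Connection closed by" line pasted earlier, with a documentation-range address), `closed_connections` is a hypothetical name, and journalctl spells the option `--output=short-iso`:

```shell
#!/bin/sh
# Hypothetical filter: read journal lines on stdin, print "timestamp address"
# for each "Connection closed by" event.
# Assumed input shape (space-separated fields):
#   2015-04-08T02:30:28+0000 ia-bak sshd[16070]: Connection closed by 192.0.2.10 [preauth]
closed_connections() {
    grep 'Connection closed by' | cut -d' ' -f1,7
}

# Usage (requires systemd's journalctl; unit name taken from the chat):
#   journalctl --unit=ssh.service --output=short-iso --utc | closed_connections
```

Rotated, compressed log files would still need something like zgrep, which is the gap db48x notes.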
23:12
🔗
|
|
wp494_ has joined #internetarchive.bak |
23:15
🔗
|
|
wp494 has quit IRC (Ping timeout: 740 seconds) |
23:35
🔗
|
beardicus |
starting a fast fsck of shard1 now... |
23:36
🔗
|
beardicus |
oh, done already... was actually about 4 minutes. |
23:41
🔗
|
beardicus |
so i don't show up on the map. i don't feel sufficiently appreciated. is that code still seekrit? |
23:45
🔗
|
beardicus |
durr. nevermind... see the server branch. |