00:07 -- patricko- is now known as patrickod
00:10 -- patrickod is now known as patricko-
00:45 <closure> tpw_rules: git annex info
00:46 <tpw_rules> ugh i'm running 16 at once and still having trouble pegging my internet
00:47 -- ohhdemgir (~ohhdemgir@[redacted]) has joined #internetarchive.bak
00:50 <GitHub90> IA.BAK/server 0533231 Joey Hess: fix html directory for stats
00:50 <GitHub90> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jhIx
00:51 <SketchCow> Well, that explain that
00:52 <closure> indeed
00:52 <SketchCow> Oh shit son, we have +4's
00:52 <closure> and I see there are now a few files replicated a +4 ..
00:52 <closure> ys
00:52 <closure> +4 is the current target
00:52 <closure> hrm, or is it +3?
00:53 <closure> anyway, might get a few over the target due to collisions
00:53 -- londoncal has quit (Quit: Leaving...)
00:56 <SketchCow> So, some questions, likely some unanswerable.
00:56 <SketchCow> Can you, as the poobah, see how much disk space is out there among the clients?
00:56 <SketchCow> (I mean amount that is filling, not how much is out there and reported back. Obviously that's what that list is.)
00:56 <closure> the data is there, it would take a little bit of calculation
00:56 <closure> oh, you mean the unused space too? No
00:57 <SketchCow> What I mean, to be specific, is to go "we have X tb in total, which we are now filling."
00:57 <closure> we could add more reporting when the clients report back, but git-annex doesn't track that anyway
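A client-side sketch of the kind of extra reporting being discussed here (hypothetical — this is not something iabak or git-annex actually does; the directory argument and output field names are invented for illustration):

```shell
#!/bin/sh
# Hypothetical reporting helper: measure the filesystem holding the shard,
# so total/free capacity could be phoned home and summed across clients.
shard_dir="${1:-.}"
# df -P gives stable POSIX columns; -k reports sizes in 1K blocks
df -Pk "$shard_dir" | awk 'NR==2 { printf "total_kb=%s free_kb=%s\n", $2, $4 }'
```

The output of something like this would have to be shipped back during the hourly sync for the server to aggregate it.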
00:58 <ohhdemgir> well, this project is radged! I'm in
01:00 -- SN4T14_ (~SN4T14@[redacted]) has joined #internetarchive.bak
01:00 -- SN4T14_ is now known as SN4T14
01:00 <closure> SketchCow: oh, we could add a client count though
01:01 <closure> currently 13
01:01 <SketchCow> Well, I want that listing out there we talked about, so I can make a map. Go ahead and do it and let me know the URL
01:01 <SketchCow> Have it update, oh, once a day
01:01 <SketchCow> Or maybe more for now, while more people are joining
01:01 <closure> hmm, I'm uncomfortable putting up IP addresses on http
01:01 <SketchCow> Well, if you'd like, YOU could do the calculations, you're a genius
01:02 <closure> I could rsync them to teamarchive1, or ..
01:02 <SketchCow> And then give me the rough names it shoots out
01:02 <closure> ugh, no time
01:02 <SketchCow> busy genius
01:02 <closure> tell me something to run, or I'll give you an account
01:02 <SketchCow> Well, how about this. e-mail me a list. I'll write my code and crap, then hand it to you.
01:02 <closure> never looked at geodns stuff
01:02 <closure> sure
01:03 <SketchCow> We're just doing extremely general, after all
01:03 -- patricko- is now known as patrickod
01:03 -- patrickod is now known as patricko-
01:08 -- patricko- is now known as patrickod
01:18 <trs80> is this at a point where you want more testing clients?
01:20 <closure> yes
01:27 <SketchCow> The more the better, now.
01:27 <SketchCow> I don't want people sacrificing anything for it, don't go for unwarranted overuse, but it's worthwhile to get us in the realm of our plans
01:28 <aschmitz> closure: How big is the current test shard?
01:28 <trs80> ok, how do I get started?
01:29 <aschmitz> Ah, I can read. 2.91 TB.
01:30 <aschmitz> trs80: http://archiveteam.org/index.php?title=INTERNETARCHIVE.BAK/git-annex_implementation#SHARD1
01:32 <closure> yeah, 2.91 tb, but you don't need that much disk, it will use what you give it
01:32 <tpw_rules> how often should i fsck?
01:33 <tpw_rules> what does that check?
01:33 <closure> still need to figure that out.. it checks the file contents
01:33 <tpw_rules> ah
01:33 <tpw_rules> so just md5 *
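What the content check boils down to can be sketched in a few lines (a simplification — the real `git annex fsck` does more than this, e.g. it also warns when a file has fewer copies than required; the file, checksum, and hash choice below are purely illustrative):

```shell
#!/bin/sh
# Simplified re-creation of fsck's content check: re-hash a file and compare
# against the checksum recorded when it was added (sample values only).
f=$(mktemp)
printf 'hello\n' > "$f"
recorded=5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03  # sha256 of "hello\n"
actual=$(sha256sum "$f" | awk '{print $1}')
if [ "$actual" = "$recorded" ]; then
    echo "ok"            # content still matches what was downloaded
else
    echo "modified: $f"  # bit rot or tampering detected
fi
rm -f "$f"
```

So "just md5 *" is close in spirit, except the expected checksum is carried in the git-annex key itself rather than in a separate manifest.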
01:34 <aschmitz> Any problem running on an NFS mount?
01:36 <GitHub175> IA.BAK/pubkey d95afb3 Joey Hess: add trs80
01:36 <GitHub175> [IA.BAK] joeyh pushed 1 new commit to pubkey: http://git.io/jhni
01:36 <closure> aschmitz: there can be lock file problems. I would recommend not running multiple concurrent downloads on nfs
01:36 -- patrickod is now known as patricko-
01:39 <aschmitz> But with just one copy of git-annex, it should be fine?
01:40 <aschmitz> Second question is whether there's a way to get git-annex to use a specified keypair, rather than ~/.ssh/id_rsa.
01:40 <closure> probably.
01:40 <closure> aschmitz: the current iabak script generates its own dedicated ssh key, and makes it be used. so yes
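One way to point git (and therefore git-annex) at a dedicated key, as the iabak script does (a sketch — the key path here is illustrative, not necessarily where iabak puts its key, and `GIT_SSH_COMMAND` requires git 2.3 or newer):

```shell
#!/bin/sh
# Use a dedicated ssh key instead of ~/.ssh/id_rsa for all git operations
# in this shell session.
keyfile="$HOME/IA.BAK/id_rsa"   # illustrative path
export GIT_SSH_COMMAND="ssh -i $keyfile -o IdentitiesOnly=yes"
# From here on, e.g. `git annex sync` would authenticate with $keyfile;
# IdentitiesOnly stops ssh from falling back to keys in ~/.ssh.
echo "$GIT_SSH_COMMAND"
```

The same effect can be had per-repository with `git config core.sshCommand` on newer git versions.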
01:40
🔗
|
aschmitz |
Ah, fun. |
01:43
🔗
|
|
patricko- is now known as patrickod |
01:44
🔗
|
trs80 |
closure: oops, so let me send you that new key |
01:44
🔗
|
trs80 |
hmm, so you don't need to manually install the latest git-annex, ./iabak does that for you (although it got an i386 version on amd64) |
01:46
🔗
|
GitHub59/#internetarchive.bak |
IA.BAK/pubkey d2f5097 Joey Hess: swap in right key for trs80 |
01:46
🔗
|
GitHub59/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to pubkey: http://git.io/jhWq |
01:47
🔗
|
closure |
yeah, hadn't realized that old documentation was still in the wiki |
01:50
🔗
|
trs80 |
hmm, now it's gone into a sleep for 1 hour |
01:50
🔗
|
|
patrickod is now known as patricko- |
01:52
🔗
|
trs80 |
right, because iabak-helper doesn't run git-annex init in shard1 |
01:53
🔗
|
trs80 |
hmm, but it should have ... |
01:53
🔗
|
trs80 |
ah, because my git wasn't configured with user.name/email |
01:55
🔗
|
closure |
oh, ok |
01:56
🔗
|
closure |
trs80: did that leave the shard1 empty? |
01:57
🔗
|
closure |
sounds like the problem zottelbey had earlier |
01:57
🔗
|
trs80 |
closure: yeah, it did |
01:58
🔗
|
GitHub42/#internetarchive.bak |
IA.BAK/master 717e95e Joey Hess: set user.name and user.email locally to deal with systems where git falls over otherwise... |
01:58
🔗
|
GitHub42/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to master: http://git.io/jh8D |
02:02
🔗
|
trs80 |
also line 107 comparing versions failed a little bit because I ran git-annex init in the root dir, causing repository version lines to be output. maybe add -m1 to the grep "version" |
02:04
🔗
|
trs80 |
little things to work around stupid users :) |
02:04
🔗
|
GitHub125/#internetarchive.bak |
IA.BAK/master 418a7d3 Joey Hess: make version grep look at 1st line |
02:04
🔗
|
GitHub125/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to master: http://git.io/jhBB |
02:05
🔗
|
tpw_rules |
http://cl.ly/image/24082s0z3c1c/Screen%20Shot%202015-04-01%20at%209.04.57%20PM.png EHEHEHHEHE |
02:10
🔗
|
GitHub120/#internetarchive.bak |
IA.BAK/pubkey 1f9922f Joey Hess: add aschmitz |
02:10
🔗
|
GitHub120/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to pubkey: http://git.io/jh0B |
02:11
🔗
|
aschmitz |
Thanks! |
02:20
🔗
|
underscor |
closure: fwiw I still can't ssh |
02:20
🔗
|
|
hatseflat (~hatseflat@[redacted]) has joined #internetarchive.bak |
02:20
🔗
|
underscor |
(at least, the script fails with "you're not signed up yet") |
02:22
🔗
|
underscor |
interestingly manually reconstructing the command its using works fine |
02:22
🔗
|
underscor |
(ssh SHARD1@iabak.archiveteam.org git-annex-shell -c configlist shard1) |
02:22
🔗
|
aschmitz |
underscor: The script uses id_rsa in the local directory, while your manual command uses the one in ~/.ssh. |
02:22
🔗
|
underscor |
aha |
02:24
🔗
|
aschmitz |
Might be best to generate a new key / use the id_rsa.pub that probably got generated, rather than copying your personal id_rsa around to more places, though. |
02:24
🔗
|
underscor |
this was a manually generated one from a previous iteration |
02:25
🔗
|
underscor |
but yeah, good point to consider |
02:25
🔗
|
trs80 |
how long should I expect git-annex --library-path to churn before starting to download? been about 20 minutes so far |
02:26
🔗
|
underscor |
hum |
02:26
🔗
|
underscor |
my dirname doesn't have -z |
02:26
🔗
|
underscor |
weird |
02:27
🔗
|
closure |
trs80: it can take a while on a slower disk.. you could ctrl-c, touch IA.BAK/NOSHUF and avoid the overhead of the shuffling it does |
02:30
🔗
|
aschmitz |
"git annex [...] get -- [item names]" seems to just be hanging for me? There's a "[git] <defunct>" a few processes after it, if that's relevant. |
02:30
🔗
|
aschmitz |
100% CPU, but no network traffic, and strace seems to mostly be it checking the time. |
02:31
🔗
|
tpw_rules |
this is gonna make my disk fragmented as fuck |
02:31
🔗
|
trs80 |
aschmitz: same here |
02:32
🔗
|
tpw_rules |
what happens if you just try git annex get |
02:32
🔗
|
tpw_rules |
that's what i'm doing and it's working great |
02:33
🔗
|
aschmitz |
Hmm. That'll be ordered, though. |
02:33
🔗
|
aschmitz |
Which isn't ideal, but better than busy waiting. |
02:34
🔗
|
tpw_rules |
ordered? |
02:34
🔗
|
aschmitz |
Alphabetical by item name, no? |
02:34
🔗
|
aschmitz |
(The requests) |
02:34
🔗
|
tpw_rules |
oh yeah, i think |
02:34
🔗
|
GitHub17/#internetarchive.bak |
IA.BAK/server 56507ef Joey Hess: sketchcow's geoip extractor |
02:34
🔗
|
GitHub17/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jh2S |
02:35
🔗
|
closure |
aschmitz: by default it looks at all 100 thousand files, finds ones that don't have enough copies, scrables the list, and downloads at random. this takes a while |
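The selection closure describes can be mimicked in a few lines (a sketch, not the real implementation — the input format with a known-copy count per file is invented for illustration, and the threshold of 3 copies is taken from the discussion):

```shell
#!/bin/sh
# copies.txt: "<known copies> <file>" per line (made-up sample data)
cat > copies.txt <<'EOF'
3 item-a
1 item-b
2 item-c
4 item-d
EOF
# Keep only under-replicated files, then shuffle, so clients fetch in a
# random order instead of all hammering the same alphabetical prefix.
awk '$1 < 3 { print $2 }' copies.txt | shuf
rm -f copies.txt
```

The shuffle is what makes the initial scan slow on a large shard, hence the `NOSHUF` escape hatch mentioned earlier.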
02:35 <tpw_rules> did that update just recently happen?
02:36 <tpw_rules> because mine is doing it alphabetically
02:36 <tpw_rules> using like git annex get --not --copies 3
02:39 <aschmitz> Well, I think it had picked some to download, as it has a huge command line that looks like a result of that.
02:40 <closure> hmm idn
02:41 <aschmitz> Hm, interesting problem.
02:41 <aschmitz> Looks like some of these items have since been darked.
02:42 <tpw_rules> there are many in the shard that have been
02:42 <trs80> I just killed that process and a new one started, which is now writing stuff
02:42 <tpw_rules> (though i'm not sure what 'darked' means in IA lingo? is it permanent?)
02:42 <GitHub35> IA.BAK/server 1dafefc Joey Hess: perm fixup
02:42 <GitHub35> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jhVP
02:42 <trs80> although that git is now defunct again
02:43 <aschmitz> "darking" just makes items unavailable to the public, the IA keeps a copy, and could revert that if they wanted to. Doesn't usually happen, though, as far as I know.
02:43 <GitHub177> IA.BAK/server efda62a Joey Hess: sketch had a sort -u in there which I forgot
02:43 <GitHub177> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jhVH
02:47 <trs80> closure: so the shuf completes, but the git-annex subprocesses is dying later on
02:51 <closure> SketchCow: here we are! http://iabak.archiveteam.org/stats/SHARD1.geolist
02:52 <closure> trs80: hmm, if git cat-file is dying for some reason that must be the problem. I'd like to debug this, but not tonight
02:52 <tpw_rules> it has my zip code wrong :(
02:53 <sep332> it's got my city right - probably because the box is sitting inside the ISP lol
02:53 <aschmitz> Yay ICBM addresses for everyone. :-/
02:54 <tpw_rules> sep332: how do you do these things? what isp
02:54 <garyrh> Must get IA in Antarctica...
02:54 <sep332> oh i cheat! i work there ;)
02:55 <tpw_rules> oh
02:55 <tpw_rules> but shit, that lat/lon is only 3 miles from my house
02:56 <SketchCow> http://iabackup.archiveteam.org/ia.bak/
02:56 <tpw_rules> whoever is doing that may want to chop off a couple digits just in case
02:57 <SketchCow> if you're truly concerned about the lat-long
02:57 <SketchCow> We can remove it.
02:57 <closure> SketchCow: hmm, why only 6 clients?
02:57 <SketchCow> we remove IP already.
02:57 <SketchCow> closure: Old data
02:57 <SketchCow> I've been hacking, bro!
02:57 <underscor> closure: is it expected that the git cat-file sits using a bunch of cpu for a while before downloading starts?
02:58 <aschmitz> Personally I'd stick with country and region, but I wouldn't fight over it or anything.
02:58 <tpw_rules> i'm personally not at all. but it's a concern in the community. i'd probably round to one decimal
02:58 <closure> underscor: yes
02:58 <SketchCow> closure - kill lat-long
02:58 <SketchCow> The code is obvious in the script
02:58 <SketchCow> And my thing doesn't care.
03:01 <SketchCow> http://iabackup.archiveteam.org/ia.bak/ now upgraded with all 15 clients
03:01 <GitHub151> IA.BAK/server 61a093f Joey Hess: de-icbm
03:01 <GitHub151> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jhKz
03:03 <trs80> closure: yeah, git ls-files is what goes defunct
03:03 <closure> oh, interesting it's ls-files
03:04 <trs80> right now it's stuck in write(1,
03:04 <closure> kinda suggests it's due to all those files being shoved through the command line and to ls-files
03:04 <trs80> so the destination pipe is full I guess
03:04 <closure> ls-files is stuck in write?
03:05 <trs80> yeah
03:05 <SketchCow> Now the fun part, more nerdy than anything.
03:05 <SketchCow> I want to add a second visual chart.
03:05 <SketchCow> I mean a second graphic chart. Now to understand how to make the api not blow up.
03:05 <tpw_rules> to do what
03:06 <SketchCow> just US.
03:06 <SketchCow> Because it's important to know how far from IA ground zero they are.
03:06 <tpw_rules> oh
03:06 <tpw_rules> where is IA ground zero?
03:06 <SketchCow> San Francisco.
03:06 <tpw_rules> ah
03:06 <SketchCow> We have one person in Walnut Creek
03:06 <tpw_rules> also why can't you just mail tapes to maine or something
03:06 <SketchCow> Fuck that guy, bomb's going to get him too
03:06 <SketchCow> that's a different solution path
03:07 <SketchCow> We can build a really nice off-road car, AND work on our sailboat
03:07 <SketchCow> AND our drone army
03:07 <tpw_rules> yes
03:07 <tpw_rules> where is amazon headquarters
03:07 <SketchCow> Seattle
03:07 <tpw_rules> oh
03:07 <SketchCow> But bear in mind, their shit is EVERYWHERE
03:07 <tpw_rules> i was gonna say "free IA tape with every drone delivery"
03:09 <tpw_rules> beam gps at them so they come to your facility, attach a tape, then let them go
03:11 <tpw_rules> do you think you'll be able to accelerate backup faster than the archive adds new crap?
03:11 <SketchCow> How did you stumble into this project?
03:11 <SketchCow> My tweet?
03:11 <tpw_rules> a tweet from @textfiles
03:11 <SketchCow> Holding off on USian graph - just because we had some nice advancement today, don't need to hack to 6am
03:11 <aschmitz> tpw_rules: SketchCow = @textfiles
03:12 <tpw_rules> ah. so yes
03:13 <SketchCow> I'll add US later.
03:13 <SketchCow> So, tpw_rules - I could say a very long thing, or I could say "the mountain must be climbed".
03:13 <SketchCow> Is "the mountain must be climbed" sufficient or do you want the long thing.
03:13 <tpw_rules> that's good enough
03:14 <tpw_rules> i'll do my part
03:14 <SketchCow> the project is forcing a mass of assessment of the archive
03:14 <tpw_rules> also particularly in the US good fucking luck finding people without capped internet
03:14 <tpw_rules> i have to pay a ridiculous amount extra for no cap
03:14 <SketchCow> Which was desperately needed.
03:15 <tpw_rules> how is it protected internally? do you do tape or something in the HQ?
03:15 <SketchCow> Everything with little exception is on spinning disks
03:15 <tpw_rules> are they all spinning at once?
03:16 <tpw_rules> i assume you can recover from a failed drive for example. but what about accidentally rm -rf?
03:17 <tpw_rules> also btw textfiles.whatever is real neat
03:17 <tpw_rules> i have to confess to being a youngin, so i was never around for that. but it's cool to read about
03:18 <SketchCow> textfiles.whatever has always been proud of bringing history to the youngins
03:18 <SketchCow> unless you make a bomb, and then we know you're not a smart youngins and we let evolution sort that out
03:18 <tpw_rules> i knew about it but never really immersed myself in it. i'm at least fluent in 6502 assembly language though, but not the culture
03:18 <SketchCow> http://iabackup.archiveteam.org/ia.bak/ now lists the countries because I got tired of counting.
03:19 <SketchCow> Or counts, anyway
03:19 <tpw_rules> can you put a size over the tree view at the top?
03:19 <SketchCow> I did. 2.91 terabytes.
03:20 <tpw_rules> no i mean for each box
03:20 <SketchCow> Not right now, no.
03:20 <tpw_rules> ie 1.7TB is not redundant at all
03:20 <SketchCow> use the areaaaaaa
03:20 <SketchCow> that's what it's forrrrr
03:20 * tpw_rules gets out ruler
03:20 <tpw_rules> also textfiles.com*
03:21 <SketchCow> Also http://textfil.es/
03:21 <SketchCow> for those pesky blockers
03:21 <yipdw> how is there not a .whatever gTLD at this point
03:21 <tpw_rules> lol. i need to try it at school
03:21 <tpw_rules> (though i always run with a vpn)
03:24 <tpw_rules> can you guys remove a repo from the list? i deleted everything from mine because it was being funky and now i show up twice. 1d92bde5-54d3-41bc-932e-d8e8e7bfff51 is my real one and ff2f752d-b35a-4555-b8b4-617f23e4e015 is bad
03:26 * trs80 touches NOSHUF and starts again
03:26 <trs80> ahh, sweet downloads
03:29 <closure> trs80: I reproduced the problem.. so I'll probably be able to fix it
03:29 <trs80> closure: ah, cool. was going to say I'm in UTC+8 if you wanted to look at it another time
03:30 <closure> it seems it prints out an enormous list of directory names before stalling?
03:30 <tpw_rules> okay it's sleepy time for me. closure can you delete that extra repo?
03:30 <closure> tpw_rules: we could, but let's not worry about it. We want to automatically detect dead repos and disregard them
03:31 <tpw_rules> ok. i just noticed it with annex info
03:31 <tpw_rules> but goodnight. got 500gb so far
03:32 <trs80> closure: yeah, that sounds like what's happening
03:32 <closure> that is seriously weird. it's like it thinks that's all one file
03:34 <closure> I think I'll just make it run git-annex once per dir for now, and debug this tomorrow
03:34 <GitHub127> IA.BAK/master b3e2de7 Joey Hess: temporary workaround for strange hang-of-doom when git-annex is given a really, really big list of dirs to get
03:43 <GitHub127> [IA.BAK] joeyh pushed 1 new commit to master: http://git.io/jhDX
03:43 <trs80> closure: the workaround wfm, although I've still got a defunct git process (not sure what type)
03:43 <closure> *** SO, if iabak is stuck not doing anything and eating cpu, now's a good time to restart it ***
03:43 <trs80> cat-file and wget are fine though
03:43 <underscor> remote: error: hook declined to update refs/heads/synced/master
03:43 <underscor> :(
03:44 <closure> underscor: intentional, you're not supposed to be changing the master branch
03:44 <underscor> oh
03:44 <underscor> did I do something wrong?
03:44 <underscor> haha
03:44 <underscor> how do I check?
03:45 <closure> git log master
03:45 <underscor> aha
03:45 <underscor> wonder how that git commit happened, weird
03:45 <closure> and then you'll want to git reset --hard HEAD^ or so :)
03:45 <closure> but, please carry on trying to break it ;)
03:46 <closure> just don't break it by doing horrible commits in the git-annex branch, that is not checked yet at all
03:46 <underscor> ok
03:46 <SketchCow> OH LOOK WHAT BROUGHT BACK UNDERSCOR
03:47 <underscor> :D
03:48 <yipdw> wow
03:49 <underscor> man my shard is still really broken
03:49 <underscor> closure: I reverted the commit and did the reset --hard
03:49 <underscor> but it's still trying to commit master on annex sync
03:50 <closure> underscor: you probably have a synced/master branch lying around with the bad commit in it, which you'd need to delete
03:50 <underscor> closure: is delete different than revert in this context?
03:51 <closure> oh, but commit master .. idk, why it would have something to commit
03:51 <underscor> http://p.defau.lt/?rj4PyeY9VmzbB5LnHYw2qQ
03:52 <closure> yeah, git branch --delete synced/master
03:53 <underscor> closure: and now, http://p.defau.lt/?0gvXoSd272FIja7elUcgBQ
03:53 <closure> you need to delete synced/master and reset master, both
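The two-step recovery closure is describing, demonstrated on a throwaway repository (a sketch — commit messages and identities are illustrative, only the branch name `synced/master` is taken from the conversation):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m good    # a commit that should be kept
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m stray   # the accidental commit on master
git branch synced/master               # sync branch mirroring the stray commit
git branch -D synced/master >/dev/null # step 1: delete the bad synced branch
git reset -q --hard HEAD^              # step 2: roll master back past it
git log --format=%s                    # only "good" remains
```

Reverting alone is not enough because `git annex sync` would merge the stray commit right back in from `synced/master`, which is why both the branch and master need fixing.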
03:54 <underscor> yay!
04:00 <closure> trs80: ok, figured it out. It's simply an exponential blowup due to some fancy stuff it tries to do with the command line. Plus possibly a little bit of truncation
04:01 <SketchCow> Improvement Continues!
04:02 <SketchCow> Eventually, I will turn the graph page into an ad to help with the experiment.
04:12 -- espes__ (~espes@[redacted]) has joined #internetarchive.bak
04:17 <closure> sweet, sped that up by like 1000x
04:36 -- zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak
04:40 <GitHub152> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jhN2
04:40 <GitHub152> IA.BAK/server 756666f Joey Hess: grep the compressed auth.log too, to get a full month of IPs
05:04 -- zottelbey has quit (Remote host closed the connection)
06:14 <GitHub59> IA.BAK/server 930b511 Joey Hess: gc repo too
06:14 <GitHub59> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/jjY8
06:16 <GitHub154> IA.BAK/master 30f5611 Joey Hess: remove debug output
06:16 <GitHub154> [IA.BAK] joeyh pushed 1 new commit to master: http://git.io/jjYF
07:05 -- bzc6p_ (~bzc6p@[redacted]) has joined #internetarchive.bak
07:10 <trs80> right, 10 iabaks running, should be done in just over a day
07:11 -- bzc6p has quit (Ping timeout: 600 seconds)
07:52 <GitHub182> IA.BAK/pubkey 0e91e20 Daniel Brooks: another for me
07:52 <GitHub182> [IA.BAK] db48x pushed 1 new commit to pubkey: http://git.io/jjwL
08:11 -- londoncal (~londoncal@[redacted]) has joined #internetarchive.bak
08:48 -- londoncal has quit (Quit: Leaving...)
08:52 -- bzc6p_ has quit (Read error: Connection reset by peer)
08:53 -- bzc6p (~bzc6p@[redacted]) has joined #internetarchive.bak
09:12 -- edsu (~edsu@[redacted]) has joined #internetarchive.bak
09:23 <midas> hm? you can start multiple jobs on 1 box?
09:27 <db48x> yea, git annex commands very carefully avoid stepping on their own toes
09:28 <db48x> you can run 'git annex get' as many times in parrallel as you want
09:28 <db48x> easiest way to do that is to run iabak multiple times
09:29 <db48x> and then while they're running you can run git annex get manually to pull down a specific item that you're interested in
09:37 <bzc6p> midas: Except for network filesystems.
09:39 <midas> good point :p
11:34 <hater> db48x: i think gnu/parallel (http://www.gnu.org/software/parallel/ ) would be nice to be build into the helper-script (as some kind of option)
11:57 <GitHub79> [IA.BAK] zottelbeyer opened pull request #9: correct zottelbeyer's pubkey (pubkey...patch-1) http://git.io/veebN
11:59 <GitHub75> IA.BAK/pubkey 09a303b Daniel Brooks: Merge pull request #9 from zottelbeyer/patch-1...
11:59 <GitHub75> IA.BAK/pubkey 6bb10d9 zottelbeyer: correct zottelbeyer's pubkey...
11:59 <GitHub169> [IA.BAK] db48x closed pull request #9: correct zottelbeyer's pubkey (pubkey...patch-1) http://git.io/veebN
11:59 <GitHub75> [IA.BAK] db48x pushed 2 new commits to pubkey: http://git.io/veeN1
12:00 <db48x> hater: possibly. it might be easier to build support for concurrent downloads into git annex itself
12:16 <hater> i am too lazy to lern haskell to programm a tool which already exists
12:17 <hater> https://git-annex.branchable.com/todo/parallel_get/ <-- Posted 3 months and 14 days ago
14:43 <sep332> i switched over to using the iabak script and i'm getting an error
14:43 <sep332> error: Untracked working tree file 'internetarchivebooks/100storyofpatrio00sinc/100storyofpatrio00sinc_archive.torrent' would be overwritten by merge.
14:43 <sep332> can I just delete the file and try again?
14:53 <closure> that's weird.. is your repository in direct mode maybe?
14:53 <closure> git annex info --fast
14:54 <closure> 1st line
14:54 <sep332> i just did a "git clone" and then copied the files to the shard1 folder
14:54 <sep332> "indirect"
14:54 <closure> oh, hm, so you switched over by copying files?
14:55 <sep332> yeah
14:55 <closure> I hope you copied .git/annex that's where the actual downloads are
14:55 <closure> but really, the right way is to just move your old git repo to IA.BAK/shard1
14:56 <sep332> ok. i didn't realize about .git
15:04 <closure> suggest you move the files back to the old repo, delete the new repo, and move the old repo
15:14 <sep332> ok, it's working fine. thanks closure
15:14 <closure> wow, we're over 50% on SHARD1
15:14 <closure> er, no. Over 25% :)
15:15 <sep332> well... i have 1.9TB on this drive now
15:15 <sep332> shouldn't that be higher then?
15:15 <closure> maybe.. could be your client has not communicated back, if you just started running the script
15:15 <closure> did you ever git annex sync manually before?
15:16 <closure> script does it once an hour
15:16 <sep332> yeah, i stole that snippet that runs sync every hour
15:16 <sep332> and i ran it manually twice in the last hour
15:17 <closure> well, at 3 copies, SHARD1 needs 9 tb
15:18 <SketchCow> Right.
15:18 <closure> of course, the graph is counting by files not by size anyway. So somewhat comparing apples and oranges
15:18 <SketchCow> Shhh
15:19 <SketchCow> Don't wreck my dreams
15:19 <SketchCow> I agree, size is ideal.
15:20 <SketchCow> But I like incrementing, after all
15:20 <SketchCow> http://blog.dshr.org/2015/03/the-opposite-of-lockss.html
15:20 <SketchCow> (My comment at end)
15:24 <SketchCow> closure: If you create output files of data updated regularly about the activity, I'll make pretty graphs that display them.
15:25 <closure> SketchCow: how about a connecting clients per hour graph?
15:32 <SketchCow> I'm for any textfiles you want to generate.
15:32 <SketchCow> I'm not doing particularly smart graphing, so I'm converting files into graphs
16:13 <hater> warning: the iabak-helper script is broken atm: someone changed the output of 'git-annex version' - i pulled an bugfix but it is not merged into the master atm
16:21 <hater> here is the bugfix: https://github.com/cancerAlot/IA.BAK/commit/6c432e4808ebad9cbcb33902535a575d6b687f0e
16:55 <SketchCow> closure: How hard is it for me to give you a collection and have you go "aaaaand here's the stats on that."
16:56 <SketchCow> i.e. how big it is (originals and system files, number of items)
17:01 -- svchfoo1 has quit (Quit: Closing)
17:06 -- svchfoo1 (~chfoo1@[redacted]) has joined #internetarchive.bak
17:09 -- svchfoo2 gives channel operator status to svchfoo1
17:14 -- zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak
17:20 <zottelbey> alright, its working now! though the speed is somewhat terrible.
17:22 <hater> db48x: 'one tool for one thing' - does implementing parallel-support into git-annex violate that 'rule'?
18:58 -- patricko- is now known as patrickod
19:03 <GitHub0> [IA.BAK] db48x created git-annex from synced/git-annex (+0 new commits): http://git.io/veU3j
19:03 <GitHub188> [IA.BAK] db48x created synced/git-annex from git-annex (+0 new commits): http://git.io/veU3p
19:03 <GitHub169> [IA.BAK] db48x created synced/master from master (+0 new commits): http://git.io/veU3h
19:04 <GitHub97> IA.BAK/server 2bad8e6 Joey Hess: add a client connections per hour data file
19:04 <GitHub97> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/veUsE
19:09 <GitHub62> [IA.BAK] joeyh pushed 1 new commit to server: http://git.io/veUZH
19:09 <GitHub62> IA.BAK/server 14ffc71 Joey Hess: typo
19:11 <closure> hater: current version code has: git-annex version | grep "version" -m 1
19:11 <closure> which seems to work ok...
19:13 * closure goes and adds a git annex version --raw anyway
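The breakage hater hit comes from grepping human-oriented output; the `-m 1` fix discussed earlier can be seen against simulated output (a sketch — the sample text mimics, rather than copies, real `git-annex version` output on a machine where `git-annex init` has also been run):

```shell
#!/bin/sh
# Simulated `git-annex version` output: once a repository has been
# initialized in the current directory, extra "version" lines appear.
output='git-annex version: 5.20150327
local repository version: 5
supported repository versions: 5'
# Without -m 1 all three lines match "version"; -m 1 stops after the first,
# so only the program version line is selected.
printf '%s\n' "$output" | grep -m 1 "version"
```

A machine-readable flag like the `--raw` closure mentions is the sturdier long-term answer, since parsed human output can change between releases.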
19:14
🔗
|
GitHub73/#internetarchive.bak |
IA.BAK/server 0ac68d4 Joey Hess: typo2 |
19:14
🔗
|
GitHub73/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to server: http://git.io/veUCi |
19:17
🔗
|
closure |
SketchCow: I can ingest a collection into a new shard pretty quickly, and then can do anything we can do with SHARD1 |
19:18
🔗
|
closure |
by pretty quickly, 10 minutes or so |
19:18
🔗
|
SketchCow |
Which is great. |
19:18
🔗
|
SketchCow |
Mostly, I just wanted the ability for you to look at a collection and go "it's this big" |
19:19
🔗
|
closure |
I've only done that for the number of files, which is all I care about, not disk size. The data is available in the census though |
19:20
🔗
|
SketchCow |
Anyway, I have our next collection, I think. |
19:20
🔗
|
SketchCow |
usfederalcourts |
19:21
🔗
|
SketchCow |
and genealogy |
19:21
🔗
|
SketchCow |
But obviously I think we should be at a SOLID 4 for current shard before we add more shards. |
19:21
🔗
|
SketchCow |
And we have some shard-punching to do, etc. |
19:23
🔗
|
closure |
are we going to 3 or to 4? |
19:23
🔗
|
closure |
and by 4 I mean, 4 including IA |
19:24
🔗
|
closure |
it's a big decision |
19:24
🔗
|
sep332 |
aw, what's 14PB between friends |
19:25
🔗
|
closure |
... times 1770 |
19:25
🔗
|
closure |
er, you already multiplied, didn't yu |
19:25
🔗
|
closure |
numbers too big |
19:25
🔗
|
closure |
SketchCow: so here is a new textfile for you.. http://iabak.archiveteam.org/stats/SHARD1.clientconnsperhour |
19:26
🔗
|
closure |
that is the number of clients that connected for that shard, per hour. |
19:26
🔗
|
SketchCow |
client connections |
19:26
🔗
|
SketchCow |
Since people seem to be uber-connecting |
19:26
🔗
|
closure |
the guys that are running concurrent iabak scripts count multiple |
19:26
🔗
|
sep332 |
it would take 3 days for my computer to count that high |
19:26
🔗
|
closure |
call it "worker threads" or something |
19:27
🔗
|
closure |
it would make a nice bar graph |
19:27
🔗
|
SketchCow |
closure: http://iabackup.archiveteam.org/ia.bak/ |
19:27
🔗
|
SketchCow |
I'm assuming we're working to make it 100% Dark Green |
19:27
🔗
|
closure |
that's how it's set right now, yes |
19:27
🔗
|
SketchCow |
(the area graph. Making the map 100% dark green will take longer, muhahaha) |
19:28
🔗
|
closure |
13124639 usfederalcourts.list |
19:28
🔗
|
SketchCow |
closure: Well, that's what I'm shooting for. |
19:28
🔗
|
SketchCow |
Yes, usfederalcourts will be 1.3tb |
19:28
🔗
|
closure |
13 million.. so, that's 130 shards. They may be smaller than usual disk size, I dunno |
19:28
🔗
|
closure |
that's file count |
19:28
🔗
|
SketchCow |
Ah. |
19:29
🔗
|
SketchCow |
Well, anyway, point is that I always assumed "4" (3+IA). Everything green |
19:29
🔗
|
closure |
oh, ok. I pulled COPIES=4 in iabak from /dev/ass |
19:30
🔗
|
SketchCow |
My documentation and writing mentions it |
19:30
🔗
|
SketchCow |
I bet you got it there |
19:30
🔗
|
SketchCow |
These "sectors" are then checked into the virtual drive, and based on whether there are zero, one, two, or more than two copies of the item in "The Drive", a color-coding is assigned (Red, Yellow, Green). |
19:30
🔗
|
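[editor's note] The color-coding rule SketchCow describes could be sketched as below. The exact bucket-to-color mapping is an assumption on my part (the message names four buckets but only three colors):

```shell
# Hypothetical sketch of the Red/Yellow/Green coding by copy count.
# The mapping (0 -> Red, 1 -> Yellow, 2 or more -> Green) is an assumed
# reading of the description above, not taken from the actual code.
color_for_copies() {
    case "$1" in
        0) echo Red ;;
        1) echo Yellow ;;
        *) echo Green ;;
    esac
}
color_for_copies 0   # -> Red
color_for_copies 2   # -> Green
```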
|
bzc6p has quit (Read error: Operation timed out) |
19:31
🔗
|
SketchCow |
I just stole that idea from Josh S., creator of Delicious and most bitter Google Employee ever |
19:31
🔗
|
SketchCow |
Who told me GMail works on "5 copies of mail, in 3 discrete geographical locations, at all time" |
19:32
🔗
|
closure |
5 is my bare minimum replication for important personal data. and yeah, 3 locations |
19:32
🔗
|
SketchCow |
See? So we both agree |
19:32
🔗
|
SketchCow |
IA + 3 |
19:32
🔗
|
SketchCow |
(IA is two) |
19:32
🔗
|
SketchCow |
(Sort of) |
19:32
🔗
|
SketchCow |
(Let's pretend I said it was) |
19:32
🔗
|
|
bzc6p (~bzc6p@[redacted]) has joined #internetarchive.bak |
19:32
🔗
|
SketchCow |
I mean, it's definitely two copies, but occasionally two copies end up in the same building. |
19:33
🔗
|
closure |
89 genealogy.list |
19:33
🔗
|
closure |
heh! well, we can fit that in somewhere |
19:34
🔗
|
closure |
I wonder if that's really all of it. There is a weirdness in the census where an item can be in multiple collections, and the data I'm working from just picked the first one |
19:34
🔗
|
SketchCow |
I think it's not. |
19:34
🔗
|
SketchCow |
It's huge. |
19:35
🔗
|
db48x |
oh, I ran git annex sync in IA.BAK, not IA.BAK/shard1 |
19:35
🔗
|
db48x |
that's confusing |
19:36
🔗
|
db48x |
hater: yes and no |
19:41
🔗
|
db48x |
it'd be nice if we could always just use parallel (or any of a dozen alternatives), but there are a couple of problems with it |
19:42
🔗
|
db48x |
interleaving the output of a bunch of git annex get commands is super annoying |
19:43
🔗
|
db48x |
the number of jobs to run simultaneously is not obvious; what we really care about is how much bandwidth we're using |
19:43
🔗
|
db48x |
some people want to use a lot, some people want to throttle it, some people want to do both at different times, or in different circumstances |
19:43
🔗
|
closure |
yeah, a get that started/stalled to saturate would be great |
19:44
🔗
|
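[editor's note] A minimal sketch of the kind of parallelism under discussion, using xargs -P as a stand-in for parallel. The worker count is arbitrary, and the echo is a placeholder for the real "git annex get" — the interleaved output and fixed job count are exactly the problems db48x raises above:

```shell
# Hypothetical sketch: N concurrent workers over a file list via xargs -P.
# "echo would get:" stands in for "git annex get <file>"; output order
# is not deterministic because the 4 workers run concurrently.
printf '%s\n' item1 item2 item3 item4 |
    xargs -n1 -P4 echo "would get:"
```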
midas |
this works a lot better with 10 gets |
19:45
🔗
|
|
patrickod is now known as patricko- |
19:46
🔗
|
closure |
"collection":["1880_census","microfilm","americana","us_census","genealogy","additional_collections"] |
19:47
🔗
|
db48x |
some have a cap and don't care about throughput, but only the total data uploaded/downloaded |
19:47
🔗
|
closure |
seeing a lot of that kind of thing.. that's presumably why genealogy has so few items, they all went to other more specific things |
19:48
🔗
|
closure |
35518 1880_census |
19:48
🔗
|
|
SN4T14_ (~SN4T14@[redacted]) has joined #internetarchive.bak |
19:49
🔗
|
closure |
57995 jstor_virglawregi |
19:51
🔗
|
|
closure wonders if we can go to 200 thousand or so per shard. Have not noticed much scaling issues with 100k files. Except for that startup delay for shuffling.. |
19:54
🔗
|
closure |
103554 nasa_techdocs .. that would be a nice shard |
19:54
🔗
|
db48x |
closure: this is a side issue, but I just noticed that every single iabak-helper I've ever run is still waiting around to do a sync every hour |
19:55
🔗
|
closure |
because they bg? |
19:55
🔗
|
db48x |
yep |
19:55
🔗
|
closure |
perhaps it should fork off a single syncer if none is running |
19:56
🔗
|
closure |
let's see, what lock file program is portably available..? |
19:56
🔗
|
closure |
I'm thinking maybe perl |
19:57
🔗
|
|
SN4T14 has quit (Ping timeout: 512 seconds) |
19:57
🔗
|
db48x |
doesn't the assistant already do that? |
19:57
🔗
|
midas |
you can try the ftp boneyard, it's big and has huge files |
19:59
🔗
|
closure |
it does.. does some other stuff we maybe don't want |
19:59
🔗
|
closure |
oh, util-linux has flock(1) now |
20:05
🔗
|
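[editor's note] The flock(1) approach closure lands on could look something like the sketch below. The lock path and messages are assumptions; the actual commit that follows used its own scheme:

```shell
# Hypothetical sketch of the "only one background syncer" idea using
# util-linux flock(1). flock -n fails immediately if the lock is held,
# so a second invocation skips instead of piling up hourly syncers.
lock=/tmp/iabak-hourly-sync.lock
if flock -n "$lock" sh -c 'echo "sync running (lock held)"'; then
    : # sync finished; the lock is released when the command exits
else
    echo "another syncer holds the lock; skipping"
fi
```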
GitHub120/#internetarchive.bak |
IA.BAK/master 2c3a13d Joey Hess: use separate program for hourly background sync, and use lock file so only 1 runs |
20:05
🔗
|
GitHub135/#internetarchive.bak |
IA.BAK/server c061607 Joey Hess: this script seems to have bit rotted since I last ran it |
20:05
🔗
|
GitHub120/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to master: http://git.io/veUDM |
20:05
🔗
|
GitHub135/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to server: http://git.io/veUD1 |
20:10
🔗
|
closure |
here's a thought. SHARD2 could be made by taking the *smallest* collections, until we get to 100k files. |
20:11
🔗
|
closure |
that turns out to be 3537 collections. |
20:11
🔗
|
closure |
with the larger collections in it being ones like TheIncredibleSandwich, TheFivePercent, KingsInDisguise, HOLLER_band |
20:11
🔗
|
closure |
public_library_of_science, usda-agriculturalhistoryseries |
20:14
🔗
|
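[editor's note] The smallest-first packing closure describes can be sketched with sort and awk, using file counts quoted earlier in the log. The "count name" input format and the behavior at the 100k cutoff (stop before exceeding it) are assumptions:

```shell
# Hypothetical sketch of the smallest-first shard packer: sort
# collections by file count ascending, then take them until the
# running total would pass 100,000 files.
sort -n <<'EOF' | awk '{ total += $1; if (total > 100000) exit; print $2 }'
35518 1880_census
57995 jstor_virglawregi
103554 nasa_techdocs
89 genealogy
EOF
```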
|
patricko- is now known as patrickod |
20:22
🔗
|
|
patrickod is now known as patricko- |
20:24
🔗
|
hater |
closure: "grep "version" -m 1" - that "-m 1" part was not in the source code when i wrote the patch |
20:39
🔗
|
closure |
hater: so, it's ok now? |
20:41
🔗
|
hater |
yes |
20:48
🔗
|
closure |
http://iabak.archiveteam.org/candidateshards/ |
20:48
🔗
|
closure |
so, that's some lists of collections, starting with the ones with fewer files. Most of the lists are 100k files |
20:49
🔗
|
closure |
around 100-150 there are some interesting sets of collections |
20:50
🔗
|
closure |
http://iabak.archiveteam.org/candidateshards/smallestfirst118.lst I like this one |
20:50
🔗
|
closure |
http://iabak.archiveteam.org/candidateshards/smallestfirst118.lst |
20:50
🔗
|
closure |
oop |
20:50
🔗
|
closure |
has: NISTJournalofResearch, 1880_census, speedydeletionwiki |
20:51
🔗
|
closure |
http://iabak.archiveteam.org/candidateshards/smallestfirst107.lst archiveteam + glennbeck + some jstor |
20:54
🔗
|
closure |
http://iabak.archiveteam.org/candidateshards/smallestfirst10.lst nice grab bag |
20:55
🔗
|
GitHub14/#internetarchive.bak |
IA.BAK/server b39791a Joey Hess: add simple shard packer... |
20:55
🔗
|
GitHub14/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to server: http://git.io/veTLk |
21:15
🔗
|
GitHub1/#internetarchive.bak |
[IA.BAK] cancerAlot closed pull request #8: install the right arch of git-annex (master...master) http://git.io/jpBr |
21:16
🔗
|
closure |
SketchCow: another line for your graph.. http://iabak.archiveteam.org/stats/SHARD1.filestransferred |
21:16
🔗
|
closure |
this will get 1 line added per hour, with the timestamp, and the total number of files transferred so far. |
21:17
🔗
|
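[editor's note] Turning a cumulative counter file like that into per-hour deltas is a one-liner. The timestamps and totals below are made-up examples of the "one line per hour, timestamp plus running total" format described above:

```shell
# Hypothetical sketch: convert cumulative "timestamp total" lines into
# per-hour deltas, e.g. for a bar graph of files transferred each hour.
awk '{ if (NR > 1) print $1, $2 - prev; prev = $2 }' <<'EOF'
2015-04-02T20:00 1000
2015-04-02T21:00 1500
2015-04-02T22:00 2100
EOF
```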
GitHub57/#internetarchive.bak |
[IA.BAK] joeyh pushed 1 new commit to server: http://git.io/veTnP |
21:17
🔗
|
GitHub57/#internetarchive.bak |
IA.BAK/server f5f8a53 Joey Hess: add filestransferred data file |
21:17
🔗
|
GitHub32/#internetarchive.bak |
[IA.BAK] cancerAlot opened pull request #10: checks if "reserve" is less than the available space (master...master) http://git.io/veTnH |
21:18
🔗
|
hater |
closure: https://github.com/ArchiveTeam/IA.BAK/pull/10 closes the issue: https://github.com/ArchiveTeam/IA.BAK/issues/7 |
21:22
🔗
|
GitHub108/#internetarchive.bak |
IA.BAK/master 13e1b64 cancerAlot: Merge pull request #1 from ArchiveTeam/master... |
21:22
🔗
|
GitHub108/#internetarchive.bak |
IA.BAK/master 6c432e4 cancerAlot: bugfix because the 'git-annex version' output was changed |
21:22
🔗
|
GitHub108/#internetarchive.bak |
IA.BAK/master 7e2cf45 cancerAlot: . |
21:22
🔗
|
GitHub9/#internetarchive.bak |
[IA.BAK] joeyh closed pull request #10: checks if "reserve" is less than the available space (master...master) http://git.io/veTnH |
21:22
🔗
|
GitHub108/#internetarchive.bak |
[IA.BAK] joeyh pushed 5 new commits to master: http://git.io/veTCQ |
21:38
🔗
|
|
svchfoo2 has quit (Ping timeout: 240 seconds) |
21:39
🔗
|
|
svchfoo2 (~chfoo2@[redacted]) has joined #internetarchive.bak |
21:39
🔗
|
|
svchfoo1 gives channel operator status to svchfoo2 |
21:53
🔗
|
|
patricko- is now known as patrickod |
21:55
🔗
|
|
patrickod is now known as patricko- |
22:12
🔗
|
hater |
who is able to add my ssh public key? |
22:14
🔗
|
zottelbey |
could you implement a thread option in the script? i now have to run 4 ttys to reach 1.2-2MiB/s. i would like to max out my 7MB/s without opening another 15. |
22:15
🔗
|
hater |
zottelbey: parallel downloading is in progress |
22:15
🔗
|
zottelbey |
neat. |
22:15
🔗
|
hater |
https://git-annex.branchable.com/todo/parallel_get/ |
22:15
🔗
|
zottelbey |
hater, also for pub key: anyone with write access to the git repo. |
22:16
🔗
|
zottelbey |
i don't care about the output tbh, i just want it to be faster. |
22:17
🔗
|
zottelbey |
iabak could just run n copies of git-annex. |
22:19
🔗
|
zottelbey |
"Last edited 3 months and 15 days ago" tells me git-annex is not going to get there any time soon probably. |
22:24
🔗
|
yipdw |
the author of git-annex is present and has been making changes to better suit it for this project |
22:24
🔗
|
yipdw |
no need to be an ass |
22:25
🔗
|
db48x |
hater: I can add your key |
22:25
🔗
|
zottelbey |
yipdw, sorry, didn't mean to offend anyone. |
22:25
🔗
|
yipdw |
np, I probably read too far into it |
22:25
🔗
|
zottelbey |
probably. |
22:26
🔗
|
hater |
db48x: i sent the link in the query |
22:26
🔗
|
Kazzy |
hm, is it possible to make the script store the "backup data" in a different location than the iabak scripts etc? Having issues, script seems to create symlinks, which doesn't play well with smb/cifs shares |
22:27
🔗
|
GitHub37/#internetarchive.bak |
IA.BAK/pubkey 9886116 Daniel Brooks: add hater's public key |
22:27
🔗
|
GitHub37/#internetarchive.bak |
[IA.BAK] db48x pushed 1 new commit to pubkey: http://git.io/veTMP |
22:27
🔗
|
hater |
db48x: thx |
22:27
🔗
|
db48x |
yw |
22:29
🔗
|
db48x |
Kazzy: it's currently not an option, but you can edit iabak-helper to change the location |
22:30
🔗
|
db48x |
Kazzy: you could also go into the shard1 directory and run 'git annex direct' to switch to direct mode, which doesn't use symlinks |
22:32
🔗
|
Kazzy |
hm, i'll take a shot at changing paths in the iabak-helper script, see if it'll work that way, cheers |
22:33
🔗
|
hater |
Kazzy: if something useful comes out, push it into the repo ;) |
22:34
🔗
|
Kazzy |
well, at first it'll just be changing the hardcoded paths, I guess, will see where it goes from there |
22:47
🔗
|
GitHub64/#internetarchive.bak |
[IA.BAK] kurtmclester opened pull request #11: Changed key. -Kazzy (pubkey...pubkey) http://git.io/veTHj |
22:50
🔗
|
GitHub83/#internetarchive.bak |
IA.BAK/pubkey 6a6a11d Kurt: Changed key. -Kazzy |
22:50
🔗
|
GitHub83/#internetarchive.bak |
IA.BAK/pubkey ccb9ea4 Daniel Brooks: Merge pull request #11 from kurtmclester/pubkey... |
22:50
🔗
|
GitHub123/#internetarchive.bak |
[IA.BAK] db48x closed pull request #11: Changed key. -Kazzy (pubkey...pubkey) http://git.io/veTHj |
22:50
🔗
|
GitHub83/#internetarchive.bak |
[IA.BAK] db48x pushed 2 new commits to pubkey: http://git.io/veTQK |
22:52
🔗
|
Kazzy |
git-annex-shell: user error (git ["config","--null","--list"] exited 126) |
22:53
🔗
|
db48x |
can you show the output from prior to that? |
22:53
🔗
|
Kazzy |
Checking ssh to server... |
22:53
🔗
|
Kazzy |
only bit before that is: Hit Enter once you're signed up! |
22:53
🔗
|
Kazzy |
then throws that error, and asks me to sign up for access again |
22:54
🔗
|
closure |
Kazzy: my guess is you've mangled the path to the git repo on the server |
22:55
🔗
|
closure |
git remote add origin "$user:$dir" |
22:55
🔗
|
closure |
since that uses $dir, if you changed it, it'll look in the wrong place on the server |
22:56
🔗
|
Kazzy |
oh right hm, yeah didn't notice all the $dir references in there, will try some more poking |
22:59
🔗
|
db48x |
all those dir variables should probably stay as-is |
23:04
🔗
|
db48x |
just change to a different directory before the if on line 126 |
23:05
🔗
|
closure |
oho, stats have been broken today! |
23:06
🔗
|
closure |
seems I have a stupid permissions error |
23:09
🔗
|
closure |
omg |
23:10
🔗
|
closure |
+1 and +2 have *both* overtaken +0 in the stats! |
23:10
🔗
|
closure |
numcopies +0: 17275 |
23:10
🔗
|
closure |
numcopies +1: 42961 |
23:10
🔗
|
closure |
numcopies +2: 32420 |
23:10
🔗
|
closure |
numcopies +3: 10149 |
23:10
🔗
|
closure |
numcopies +4: 490 |
23:10
🔗
|
|
closure wonders if SketchCow's script will handle this, I forgot it sorted it like that |
23:11
🔗
|
closure |
at 2 am yesterday, we had numcopies +0: 54519 .. |
23:14
🔗
|
closure |
we've doubled the total files transferred today |
23:20
🔗
|
db48x |
Kazzy: https://github.com/db48x/IA.BAK/commit/a320bbbf0abd1359c0b20fbe7f412864437fa357 |
23:21
🔗
|
Kazzy |
oh nice, thanks for that one |
23:21
🔗
|
Kazzy |
will take a shot at that now |
23:23
🔗
|
hater |
i love this channel; someone ask for a feature and it is available in less than an hour |
23:30
🔗
|
|
zottelbey has quit (Remote host closed the connection) |
23:31
🔗
|
closure |
http://teamarchive1.fnf.archive.org/ia.bak/graph.html \o/ |
23:35
🔗
|
|
patricko- is now known as patrickod |
23:46
🔗
|
|
patrickod is now known as patricko- |