Time |
Nickname |
Message |
01:36
🔗
|
|
GLaDOS has quit (Ping timeout: 512 seconds) |
01:56
🔗
|
|
Kazzy has quit (Read error: Operation timed out) |
01:56
🔗
|
|
Kazzy (~Kaz@[redacted]) has joined #internetarchive.bak |
01:56
🔗
|
|
svchfoo1 gives channel operator status to Kazzy |
03:23
🔗
|
|
bzc6p__ (~bzc6p@[redacted]) has joined #internetarchive.bak |
03:29
🔗
|
|
bzc6p_ has quit (Ping timeout: 600 seconds) |
03:30
🔗
|
|
GLaDOS (~STR_IDENT@[redacted]) has joined #internetarchive.bak |
03:30
🔗
|
|
svchfoo1 gives channel operator status to GLaDOS |
03:42
🔗
|
|
GLaDOS has quit (Ping timeout: 260 seconds) |
03:45
🔗
|
|
GLaDOS (~STR_IDENT@[redacted]) has joined #internetarchive.bak |
03:45
🔗
|
|
svchfoo1 gives channel operator status to GLaDOS |
04:26
🔗
|
|
wp494 has quit (LOUD UNNECESSARY QUIT MESSAGES) |
05:28
🔗
|
|
wp494 (~wickedpla@[redacted]) has joined #internetarchive.bak |
06:04
🔗
|
|
zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak |
09:47
🔗
|
|
Start has quit (Ping timeout: 740 seconds) |
10:06
🔗
|
|
Start (~Start@[redacted]) has joined #internetarchive.bak |
10:06
🔗
|
|
svchfoo2 gives channel operator status to Start |
10:49
🔗
|
|
svchfoo2 has quit (Remote host closed the connection) |
10:51
🔗
|
|
svchfoo2 (~chfoo2@[redacted]) has joined #internetarchive.bak |
10:58
🔗
|
|
svchfoo1 gives channel operator status to svchfoo2 |
11:21
🔗
|
|
zottelbey has quit (Remote host closed the connection) |
11:25
🔗
|
|
bzc6p__ is now known as bzc6p |
12:00
🔗
|
|
zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak |
12:27
🔗
|
|
zottelbey has quit (Remote host closed the connection) |
13:12
🔗
|
|
zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak |
13:20
🔗
|
|
zottelbey has quit (Remote host closed the connection) |
13:35
🔗
|
|
zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak |
14:53
🔗
|
|
wp494 has quit (LOUD UNNECESSARY QUIT MESSAGES) |
15:28
🔗
|
sep332 |
1.5TB so far! |
15:58
🔗
|
|
bzc6p_ (~bzc6p@[redacted]) has joined #internetarchive.bak |
16:04
🔗
|
|
bzc6p has quit (Ping timeout: 600 seconds) |
16:40
🔗
|
|
patricko- is now known as patrickod |
16:57
🔗
|
SketchCow |
! |
16:58
🔗
|
SketchCow |
How big is the data in total? |
16:58
🔗
|
sep332 |
wiki says shard1 is 2.91TB |
17:00
🔗
|
SketchCow |
Fools, forces beyond your control, etc |
17:00
🔗
|
SketchCow |
We should probably have a second person |
17:00
🔗
|
SketchCow |
Anyone got space? |
17:07
🔗
|
|
patrickod is now known as patricko- |
17:23
🔗
|
closure |
I'm not going to do a full TB, but I'll pick up 500 or so gb |
17:24
🔗
|
SketchCow |
I want it separate from you. |
17:24
🔗
|
SketchCow |
You're on point for server and shard and watching what goes bazinga. |
17:25
🔗
|
SketchCow |
If need be, I will do it, but I'd rather it be outsider. |
17:25
🔗
|
sep332 |
closure: how many other people requested keys? |
17:25
🔗
|
closure |
so far 3 people have keys |
17:27
🔗
|
sep332 |
so just me, midas and underscor? |
17:35
🔗
|
SketchCow |
How much space is needed? |
17:35
🔗
|
SketchCow |
For this test set? |
17:36
🔗
|
closure |
it is 2.91 tb, but no one person has to take the whole thing |
17:38
🔗
|
SketchCow |
What OS should they be running |
17:38
🔗
|
closure |
linux or OSX |
17:40
🔗
|
|
closure finishes a (slow) stats run |
17:40
🔗
|
closure |
numcopies +0: 72540 |
17:40
🔗
|
closure |
numcopies +1: 30793 |
17:40
🔗
|
closure |
numcopies +2: 10 |
17:40
🔗
|
SketchCow |
Calling out. |
17:40
🔗
|
closure |
+0 is only in IA, +1 one backup, etc |
17:41
🔗
|
closure |
that's by files |
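A rough way to read the stats run above as a coverage figure, sketched in Python. Only the three counts come from the log; the variable names are illustrative and nothing here is git-annex output.

```python
# Counts quoted in the log: number of files, keyed by how many
# copies exist beyond the Internet Archive original (+0, +1, +2).
numcopies = {0: 72540, 1: 30793, 2: 10}

total_files = sum(numcopies.values())
# A file counts as "backed up" once at least one extra copy exists.
backed_up = sum(count for extra, count in numcopies.items() if extra >= 1)
coverage = backed_up / total_files

print(f"{backed_up} of {total_files} files have at least one backup ({coverage:.1%})")
```

Since the counts are by files, not by bytes, this says nothing about how much of the 2.91 TB is covered.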
17:41
🔗
|
|
balrog (~balrog@[redacted]) has joined #internetarchive.bak |
17:41
🔗
|
closure |
nice call out |
17:42
🔗
|
SketchCow |
Yes. |
17:42
🔗
|
SketchCow |
Well, time to up the test, man. |
17:42
🔗
|
closure |
oh btw guys, I forgot to mention this little config option: git config annex.diskreserve 200GB |
17:43
🔗
|
closure |
then it'll leave 200 gb free when you let it rip |
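To get a feel for what a 200 GB reserve leaves available on a given disk, here is a minimal Python sketch. It is not how git-annex implements annex.diskreserve; the helper name is hypothetical, and treating "200GB" as 200 * 10^9 bytes is an assumption.

```python
import shutil

def bytes_available_for_annex(path: str, reserve_bytes: int) -> int:
    """Return how many bytes could be filled on the filesystem holding
    `path` while still keeping `reserve_bytes` free (hypothetical helper)."""
    free = shutil.disk_usage(path).free
    return max(0, free - reserve_bytes)

# Assumption: "200GB" is read as decimal gigabytes.
RESERVE = 200 * 10**9

usable = bytes_available_for_annex(".", RESERVE)
print(f"Roughly {usable / 10**9:.1f} GB could be fetched before hitting the reserve")
```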
17:44
🔗
|
balrog |
oh, doing git-annex? |
17:44
🔗
|
closure |
right, for git-annex get |
17:44
🔗
|
balrog |
(currently I'm hitting some FTPs) |
17:46
🔗
|
closure |
git config annex.web-options --limit-rate=200k |
17:46
🔗
|
closure |
you can also configure it to tell wget to --limit-rate for bandwidth |
17:48
🔗
|
SketchCow |
So. |
17:48
🔗
|
SketchCow |
People are going to start to arrive in here, over the next day or two. |
17:48
🔗
|
SketchCow |
I'm primarily aiming them at you, closure. |
17:48
🔗
|
SketchCow |
We should probably wiki instructions, so feel free to make subpages or new pages on the wiki. |
17:51
🔗
|
SketchCow |
I've got three volunteers so far. |
17:52
🔗
|
balrog |
so I currently have ~6TB of slow but open storage, and I'm bringing 12TB online this week. expanding to 30TB over time. |
17:52
🔗
|
balrog |
Obviously I don't want to use all of this for IA.bak |
17:52
🔗
|
balrog |
(right now I'm hitting FTP servers) |
17:59
🔗
|
SketchCow |
I don't think you should use more than 500gb at this point. |
18:00
🔗
|
|
Owen-x (~owen@[redacted]) has joined #internetarchive.bak |
18:01
🔗
|
SketchCow |
8 people are interested. |
18:01
🔗
|
SketchCow |
We'll see if the IRC hurdle is an issue. |
18:01
🔗
|
SketchCow |
closure: What's in the test set, by the way. |
18:02
🔗
|
closure |
It's internetarchivebooks and usenethistorical collections |
18:03
🔗
|
SketchCow |
Aww, that's a nice set. |
18:04
🔗
|
raylee |
yeah, I noticed the tweet, but I don't have the hd space to spare right now, busy archiving other stuff |
18:04
🔗
|
raylee |
(as i told you in PM SketchCow, just didn't know a discussion about it was going on here) |
18:08
🔗
|
Owen-x |
Hello - I've got 1TB online here but could probably spare 3 or 4 in the near future |
18:08
🔗
|
Owen-x |
What do I do? |
18:09
🔗
|
closure |
Owen-x: take a read over this page: http://archiveteam.org/index.php?title=INTERNETARCHIVE.BAK/git-annex_implementation#shard1 |
18:09
🔗
|
Owen-x |
ah OK, that one |
18:09
🔗
|
SketchCow |
Owen-x: If there's clarifications you want or issues you hit, please bring them, so closure can make the instructions more definite. |
18:10
🔗
|
closure |
yes, this is all very rough and we're learning by doing |
18:10
🔗
|
Owen-x |
OK, will do, I'll work through it now |
18:12
🔗
|
|
jxyzn (cfbc1136@[redacted]) has joined #internetarchive.bak |
18:12
🔗
|
|
yuppie_ (webchat@[redacted]) has joined #internetarchive.bak |
18:13
🔗
|
|
yuppie_ has quit (Client Quit) |
18:14
🔗
|
midas |
closure: i think we should add that ppa to the list |
18:15
🔗
|
closure |
midas: go ahead |
18:15
🔗
|
Owen-x |
OK, how do I get my id_rsa.pub added so I can clone the git? |
18:15
🔗
|
midas |
im currently on my cellphone but ill add it when i get home :p |
18:15
🔗
|
closure |
Owen-x: you can just send it to me |
18:16
🔗
|
Owen-x |
email? |
18:16
🔗
|
closure |
/msg is fine |
18:16
🔗
|
Owen-x |
OK |
18:16
🔗
|
closure |
it's a public key, you could take out an ad in the NYT ;) |
18:18
🔗
|
closure |
Owen-x: key added |
18:18
🔗
|
midas |
but dont, the ads in the nyt are expensive. |
18:18
🔗
|
closure |
yes, buy TB with that $$$ instead |
18:19
🔗
|
Owen-x |
whoop! Here we go |
18:19
🔗
|
SketchCow |
---------------------------------------------------------------------- |
18:20
🔗
|
SketchCow |
If ANYONE in here is comfortable working with closure to make |
18:20
🔗
|
SketchCow |
a nice graphic interface to the activities of the server, |
18:20
🔗
|
|
jxyzn (cfbc1136@[redacted]) has left #internetarchive.bak |
18:20
🔗
|
SketchCow |
speak up. Bonus for crazy loopy stuff |
18:20
🔗
|
SketchCow |
---------------------------------------------------------------------- |
18:20
🔗
|
|
jxyzn (607e660e@[redacted]) has joined #internetarchive.bak |
18:20
🔗
|
midas |
woop woop craaazy loops |
18:21
🔗
|
midas |
closure: what kind of data are you getting back for graphs etc? |
18:21
🔗
|
closure |
all it gets back is pushes of the git-annex branch that say what file was in what clone, when |
18:23
🔗
|
midas |
any examples? |
18:23
🔗
|
midas |
(pastebin or something) |
18:27
🔗
|
closure |
http://pastebin.com/ZyQiRTSC |
18:28
🔗
|
closure |
it would be doable to mine those git commits and do some status display showing files that are backed up and not, or number of bytes stored |
18:28
🔗
|
sep332 |
how did you get that list earlier, of +0 +1 +2 files? |
18:29
🔗
|
closure |
"MD5-s28966-" is a file of 28966 bytes |
18:29
🔗
|
closure |
I used "git annex status ." in the repo for that, but it takes a couple minutes to read everything |
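The size field closure points out in "MD5-s28966-" could be pulled out of keys when mining the git-annex branch for byte counts. A minimal Python sketch, assuming keys start with an upper-case backend name followed by an optional `-s<size>` field; the function names and the sample keys in the usage lines are made up for illustration.

```python
import re
from typing import Iterable, Optional

# Backend name, then "-s<digits>" for the size in bytes.
# Keys without a size field (assumed possible) simply do not match.
KEY_SIZE = re.compile(r"^[A-Z0-9_]+-s(?P<size>\d+)(?:-|$)")

def key_size(key: str) -> Optional[int]:
    """Return the byte size encoded in a key, or None if there is none."""
    match = KEY_SIZE.match(key)
    return int(match.group("size")) if match else None

def total_known_bytes(keys: Iterable[str]) -> int:
    """Sum the sizes of all keys that carry a size field."""
    sizes = (key_size(key) for key in keys)
    return sum(size for size in sizes if size is not None)

# Usage with placeholder keys (the hash parts are not real):
print(key_size("MD5-s28966--0123456789abcdef"))
print(total_known_bytes(["MD5-s28966--aaaa", "MD5-s100--bbbb", "URL--example"]))
```

Summing sizes per clone would give the "number of bytes stored" display mentioned above without having to touch the files themselves.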
18:30
🔗
|
midas |
hm |
18:30
🔗
|
midas |
i was hoping on some more eatable data so i could use google charts |
18:33
🔗
|
midas |
SketchCow: food for thought, speed getting data from IA. I'm on a 200Mbit pipe here and barely hitting 200KB/s. Of course it depends on location what kind of speed someone will get. |
18:33
🔗
|
midas |
but it took me ~2 days for 40GB already |
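The numbers midas reports are consistent with each other, which a quick calculation shows. A small Python sketch, assuming decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and a constant rate; the function name is illustrative.

```python
def transfer_days(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to move size_bytes at a steady rate_bytes_per_s."""
    seconds = size_bytes / rate_bytes_per_s
    return seconds / 86400  # seconds per day

# 40 GB at 200 KB/s, the figures from the log.
print(round(transfer_days(40e9, 200e3), 1))

# The whole 2.91 TB shard1 at the same rate, for one downloader.
print(round(transfer_days(2.91e12, 200e3)))
```

At that rate 40 GB takes a little over two days, and a single client would need on the order of 168 days for all of shard1, which is one reason to spread the shard over many volunteers.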
18:56
🔗
|
|
hater (bneA465dS9@[redacted]) has joined #internetarchive.bak |
19:56
🔗
|
|
garyrh gives channel operator status to arkiver balrog chfoo db48x |
19:56
🔗
|
|
garyrh gives channel operator status to underscor |
20:00
🔗
|
|
bzc6p_ is now known as bzc6p |
20:21
🔗
|
SketchCow |
Understood. |
20:22
🔗
|
SketchCow |
midas - Where are you pulling from |
20:22
🔗
|
SketchCow |
Is this teamarchive1? |
20:23
🔗
|
midas |
grabbing from Location: https://ia600505.us.archive.org/15/items/americaneducator08fost/americaneducator08fost_orig_jp2.tar [following] |
20:23
🔗
|
SketchCow |
So, that's DIRECTLY from archive. |
20:23
🔗
|
SketchCow |
1. yay |
20:23
🔗
|
midas |
yeah |
20:23
🔗
|
SketchCow |
2. ohh |
20:33
🔗
|
|
richo (~richo@[redacted]) has joined #internetarchive.bak |
20:45
🔗
|
SketchCow |
I've provided a number of people links to the channel. Not sure how many will show. |
20:52
🔗
|
raylee |
k |
20:57
🔗
|
closure |
zottelbey: I've added your keu |
20:57
🔗
|
closure |
er, key |
20:57
🔗
|
zottelbey |
thanks |
21:16
🔗
|
|
zottelbey has quit (Remote host closed the connection) |
21:22
🔗
|
SketchCow |
I've been informed there's a global load balancing issue with the machines right now at the IA, so that's why some items are slow. We should proceed, but that's why. |
21:39
🔗
|
|
wp494 (~wickedpla@[redacted]) has joined #internetarchive.bak |
21:41
🔗
|
midas |
okay, thanks for the heads up SketchCow |
21:52
🔗
|
hater |
who is in charge of the git-annex-implementation or the server hosting the files (or the git-repo)? |
21:52
🔗
|
SketchCow |
closure. |
21:52
🔗
|
SketchCow |
he's running the test. |
21:53
🔗
|
hater |
SketchCow: thx |
21:55
🔗
|
hater |
closure: are you recording the serverload per client? |
21:58
🔗
|
hater |
and do you need some help? |
22:46
🔗
|
|
wp494 has quit (LOUD UNNECESSARY QUIT MESSAGES) |
22:52
🔗
|
|
Owen-x has quit (Owen-x) |
22:53
🔗
|
|
wp494 (~wickedpla@[redacted]) has joined #internetarchive.bak |
22:56
🔗
|
SketchCow |
Closure may have reasonable waking hours, so he might not get back yet. |
23:08
🔗
|
closure |
server's load is 0 0 0 |
23:09
🔗
|
|
Owen-x (~owen@[redacted]) has joined #internetarchive.bak |
23:10
🔗
|
hater |
closure: is anyone downloading files atm? |
23:11
🔗
|
closure |
sure, from the IA. No idea what their load is ;) |
23:11
🔗
|
closure |
our server just gets a git push from time to time, I imagine it will scale quite a way |
23:20
🔗
|
|
Owen-x has quit (Owen-x) |
23:20
🔗
|
|
jxyzn has quit (Quit: http://chat.efnet.org (Session timeout)) |
23:22
🔗
|
|
aschmitz has quit (Read error: Connection reset by peer) |
23:41
🔗
|
|
Owen-x (~owen@[redacted]) has joined #internetarchive.bak |