Time |
Nickname |
Message |
00:34
🔗
|
SketchCow |
I see the drop on shard3 |
00:39
🔗
|
tpw_rules |
bah i think running 8 iabaks is going to be faster than one with -J8 |
00:43
🔗
|
tpw_rules |
i'm going to try re-encoding some of this .dv because it is scary big |
01:05
🔗
|
Senji |
SketchCow: ahh, the expire *finally* happened |
01:07
🔗
|
Senji |
mariusz: I find you need about 500MB free space to run sync |
01:09
🔗
|
Senji |
tpw_rules: I run between about 5 and 20 iabaks in parallel depending on what I'm doing right now. But only for 4 hours a day :( |
01:41
🔗
|
|
primus104 has quit IRC (Leaving.) |
01:49
🔗
|
|
JesseW has joined #internetarchive.bak |
02:20
🔗
|
|
patrickod has quit IRC (Ping timeout: 258 seconds) |
02:20
🔗
|
|
patrickod has joined #internetarchive.bak |
02:22
🔗
|
|
tpw-rules has joined #internetarchive.bak |
02:23
🔗
|
|
db48x has quit IRC (hub.efnet.us irc.Prison.NET) |
02:23
🔗
|
|
dirt has quit IRC (hub.efnet.us irc.Prison.NET) |
02:23
🔗
|
|
midas has quit IRC (hub.efnet.us irc.Prison.NET) |
02:23
🔗
|
|
tpw_rules has quit IRC (hub.efnet.us irc.Prison.NET) |
02:24
🔗
|
|
dirt_ has joined #internetarchive.bak |
02:25
🔗
|
|
midas1 has joined #internetarchive.bak |
02:32
🔗
|
tpw-rules |
if i nuke all the symlinks, how can i get them back? |
02:39
🔗
|
|
dirt_ is now known as dirt |
02:39
🔗
|
|
tpw-rules is now known as tpw_rules |
02:43
🔗
|
tpw_rules |
basically, can i reconstruct .git/objects and .git/refs and all the symlinks without reiniting the repo |
03:06
🔗
|
|
jbenet__ has quit IRC (Remote host closed the connection) |
03:09
🔗
|
|
mattl has quit IRC (Quit: X_X) |
03:23
🔗
|
tpw_rules |
warning: it's about to look like i lost everything |
03:23
🔗
|
tpw_rules |
but it will come back shortly |
03:35
🔗
|
tpw_rules |
okay everything should be fixed |
03:37
🔗
|
tpw_rules |
Senji: have you experimented with running multiple iabaks that are each running in parallel? |
03:39
🔗
|
tpw_rules |
hm |
03:39
🔗
|
tpw_rules |
we also have files that appear to be binary images of CDs. why not turn them into FLAC? |
04:31
🔗
|
DFJustin |
basically IA has a policy of never fucking with what people upload |
04:32
🔗
|
DFJustin |
it generates compressed streamable versions but the original is always kept |
04:33
🔗
|
DFJustin |
so, they could work with material donors to use more suitable formats but what's there is gonna stay |
04:33
🔗
|
DFJustin |
unless the donor goes through and redoes it all |
04:34
🔗
|
tpw_rules |
ah. why? time/cpu? |
04:34
🔗
|
tpw_rules |
concern that it might get damaged? |
04:36
🔗
|
DFJustin |
I don't work there so I can't really speak to why |
04:36
🔗
|
tpw_rules |
i don't profess to understand but 'never' seems a bit extreme to me |
04:36
🔗
|
tpw_rules |
ah well |
04:36
🔗
|
DFJustin |
it's just against their whole way of doing things |
04:38
🔗
|
DFJustin |
I mean they have an entire warehouse full of books that they've already scanned, just to have them still |
04:38
🔗
|
SketchCow |
Sigh |
04:39
🔗
|
tpw_rules |
is $80 a good deal for a 3TB hard drive |
04:40
🔗
|
DFJustin |
http://www.edwardbetts.com/price_per_tb/ |
04:41
🔗
|
tpw_rules |
yes. i can get that top one $10 less per drive for a deal of 3 |
04:41
🔗
|
tpw_rules |
i may do it because i like knowing i have ridiculous amounts of data |
04:42
🔗
|
tpw_rules |
although holy hell those samsungs |
04:42
🔗
|
tpw_rules |
i didn't expect an external to be cheaper |
04:43
🔗
|
tpw_rules |
although at some point i'll have to stop since i can't keep doing this for 1762 more shards |
04:46
🔗
|
DFJustin |
they've been cheaper for quite some time, price discrimination for consumer vs enterprise buyers or some such |
04:46
🔗
|
DFJustin |
IA buys shitloads of externals and then takes the covers off |
04:46
🔗
|
tpw_rules |
interesting |
04:47
🔗
|
tpw_rules |
which specifically? |
04:47
🔗
|
tpw_rules |
eh it's getting time for sleep. night |
04:47
🔗
|
tpw_rules |
currently running at about 10 floppies/second |
05:29
🔗
|
|
JesseW has quit IRC (Ping timeout: 265 seconds) |
05:30
🔗
|
|
JesseW has joined #internetarchive.bak |
06:13
🔗
|
|
zz_CyberJ is now known as CyberJaco |
06:16
🔗
|
|
jbenet__ has joined #internetarchive.bak |
06:27
🔗
|
|
mattl has joined #internetarchive.bak |
06:38
🔗
|
|
ryang has quit IRC (Remote host closed the connection) |
07:00
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
07:07
🔗
|
iabak-reg |
registrar master 3aad84a other SHARD8/pubkeys registration of brian on SHARD8 |
07:19
🔗
|
|
ryang has joined #internetarchive.bak |
07:24
🔗
|
|
CyberJaco is now known as zz_CyberJ |
07:54
🔗
|
Senji |
tpw_rules: not significantly; I like my progress bars too much :) |
08:16
🔗
|
|
primus104 has joined #internetarchive.bak |
08:27
🔗
|
|
primus105 has joined #internetarchive.bak |
08:33
🔗
|
|
primus104 has quit IRC (Read error: Operation timed out) |
08:52
🔗
|
|
hendi__ has joined #internetarchive.bak |
08:54
🔗
|
|
primus105 has quit IRC (Leaving.) |
09:03
🔗
|
|
primus104 has joined #internetarchive.bak |
09:03
🔗
|
|
primus104 has quit IRC (Client Quit) |
10:50
🔗
|
|
midas1 is now known as midas |
11:42
🔗
|
|
db48x has joined #internetarchive.bak |
12:00
🔗
|
SketchCow |
DFJustin: IA does not buy shitloads of externals anymore. |
12:01
🔗
|
SketchCow |
They have deals with dealers - it's just when the crisis hit due to the flooding, they found this whack-ass channel for a while. |
12:01
🔗
|
SketchCow |
Also, they've jumped to 6tb and 8tbs |
12:02
🔗
|
ppiixx |
i think backblaze still strip down external drives |
12:06
🔗
|
|
mariusz has joined #internetarchive.bak |
12:45
🔗
|
|
sankin has joined #internetarchive.bak |
12:46
🔗
|
|
sankin has quit IRC (Client Quit) |
12:51
🔗
|
|
sankin has joined #internetarchive.bak |
13:30
🔗
|
sep332 |
tpw_rules: have you looked at the IA page for the huge dv file? |
13:31
🔗
|
sep332 |
wouldn't surprise me if there's already a re-encoded version there |
13:34
🔗
|
SketchCow |
There is, he's just bothered that we are archiving the original instead of the re-encode. |
13:34
🔗
|
SketchCow |
That's the breaks. |
13:34
🔗
|
sep332 |
yeah, ok |
13:49
🔗
|
|
primus104 has joined #internetarchive.bak |
13:50
🔗
|
|
hendi_ has joined #internetarchive.bak |
13:58
🔗
|
|
hendi__ has quit IRC (Ping timeout: 512 seconds) |
13:58
🔗
|
|
Start has quit IRC (Disconnected.) |
14:37
🔗
|
|
Start has joined #internetarchive.bak |
14:54
🔗
|
|
Start has quit IRC (Disconnected.) |
14:55
🔗
|
|
Start has joined #internetarchive.bak |
14:56
🔗
|
|
primus104 has quit IRC (Leaving.) |
15:02
🔗
|
tpw_rules |
yeah i'm whiny |
15:03
🔗
|
tpw_rules |
i'm getting lots of failures with shard 8 |
15:04
🔗
|
Senji |
Is your internet connection flaky? |
15:04
🔗
|
tpw_rules |
i don't think so |
15:05
🔗
|
tpw_rules |
it never has been |
15:05
🔗
|
Senji |
I've not had any problems downloading files over 100GB; even though it's taken weeks |
15:05
🔗
|
tpw_rules |
me either, but my connection is much faster |
15:06
🔗
|
tpw_rules |
https://i.imgur.com/2fANPGC.png |
15:06
🔗
|
tpw_rules |
i'm not sure how it decides 'failed' |
15:06
🔗
|
tpw_rules |
i remember having issues with shard 1 when things got shut down and it couldn't find them. has the same happened to shard 8? |
15:06
🔗
|
Senji |
I don't think your connection is *much* faster; but you're probably downloading more than 4 hours a day :-) |
15:07
🔗
|
tpw_rules |
oh yeah, that would put a damper on things |
15:07
🔗
|
tpw_rules |
how do you make it resume? |
15:08
🔗
|
Senji |
I 'killall -STOP wget' at 6am every morning and 'killall -CONT wget' at 2am (via cronjob) |
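The schedule Senji describes (wget paused at 6am, resumed at 2am) can be sketched as below. This is an illustration only, not Senji's actual setup: the function names `wget_signal_for_hour` and `apply_window` are hypothetical, and the two crontab lines in the comment are an assumed equivalent of "via cronjob".

```python
import subprocess
from datetime import datetime

# Assumed crontab equivalent of the message above (not quoted from the log):
#   0 6 * * * killall -STOP wget
#   0 2 * * * killall -CONT wget

RESUME_HOUR = 2  # downloads resume at 02:00
PAUSE_HOUR = 6   # downloads pause at 06:00


def wget_signal_for_hour(hour):
    """Return the signal name wget should be in for a given hour (0-23)."""
    if RESUME_HOUR <= hour < PAUSE_HOUR:
        return "CONT"  # inside the cheap overnight window
    return "STOP"      # outside the window, keep wget suspended


def apply_window(now=None):
    """Send the matching signal to every wget process via killall."""
    now = now or datetime.now()
    signal_name = wget_signal_for_hour(now.hour)
    # check=False: killall exits non-zero when no wget process is running
    subprocess.run(["killall", f"-{signal_name}", "wget"], check=False)


if __name__ == "__main__":
    apply_window()
```

SIGSTOP suspends the process without closing its connection, so whether a transfer survives a 20-hour pause depends on the server and on wget's own retry behaviour.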
15:09
🔗
|
tpw_rules |
why do you have to do that? |
15:09
🔗
|
Senji |
shard3 had some files that can't download because the filename quoting is wrong |
15:10
🔗
|
Senji |
Usage charging; it's essentially free during those four hours overnight. |
15:10
🔗
|
tpw_rules |
ohhhh :'( |
15:10
🔗
|
tpw_rules |
that mucho sucks |
15:16
🔗
|
tpw_rules |
how much storage do you have? i'm concerned about buying more before i look back and i've spent $7000 |
15:16
🔗
|
Senji |
Umm, at the moment I'm working on lying around bits of storage that's under my desk. Maybe 6-8TB in total? |
15:16
🔗
|
tpw_rules |
ahh |
15:17
🔗
|
tpw_rules |
that's what i've been doing but i'm running out |
15:17
🔗
|
tpw_rules |
http://www.newegg.com/Product/Product.aspx?Item=N82E16822152425 kinda considering a couple of those |
15:17
🔗
|
Senji |
db48x: a passing systemd expert points at http://www.freedesktop.org/software/systemd/man/sd_booted.html as documenting /run/systemd/system/ as the canonical test for systemd |
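A minimal sketch of the check that message refers to: the system counts as booted with systemd if the directory /run/systemd/system/ exists. The function name `systemd_booted` and its `run_dir` parameter are hypothetical, added here so the check can be pointed at another path for testing.

```python
import os


def systemd_booted(run_dir="/run/systemd/system"):
    """Return True if the directory used as the systemd marker exists.

    run_dir is an illustrative parameter; the path discussed in the log
    is the default.
    """
    return os.path.isdir(run_dir)


if __name__ == "__main__":
    print("systemd" if systemd_booted() else "not systemd")
```

This only says whether systemd is the running init system, not whether systemd is installed.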
15:19
🔗
|
tpw_rules |
it also doesn't look like syncing is happening |
15:20
🔗
|
Senji |
I've had the occasional problem with hourlysync dying |
15:20
🔗
|
* |
tpw_rules starts in another terminal |
15:21
🔗
|
Senji |
Those drives look very cheap; but maybe that's just the thing where US to UK pricing is a bit odd for computer hardware |
15:22
🔗
|
Senji |
I have a http://www.amazon.co.uk/gp/product/B00JV1YQY0 that I use in my media-centre-pc (not for iabak :)) which cost me £119.99 |
15:24
🔗
|
tpw_rules |
well in theory it's the same drive |
15:25
🔗
|
tpw_rules |
lol the guy who asked seagate for a new motherboard |
15:26
🔗
|
tpw_rules |
i have a stack of 5TB wd red NAS drives in my server but they're pretty expensive |
15:27
🔗
|
Senji |
I think you'd want greens for this. Purples maybe if you think it's more of a 24/7 usage pattern |
15:27
🔗
|
tpw_rules |
well i'm also very cheap |
15:27
🔗
|
tpw_rules |
i can get good seagate enclosures for like $16 a pop on ebay |
15:28
🔗
|
tpw_rules |
it looks like the chipset is having problems with 5TB tho |
15:29
🔗
|
tpw_rules |
that box on newegg has a lot of the same complaints as the samsung |
15:29
🔗
|
tpw_rules |
i'll have to monitor spinning of my drives. not sure how good linux is at spinning them down over usb |
15:32
🔗
|
|
JesseW has joined #internetarchive.bak |
15:44
🔗
|
|
Start has quit IRC (Disconnected.) |
15:56
🔗
|
|
JesseW has quit IRC (Quit: Leaving.) |
16:03
🔗
|
|
Start has joined #internetarchive.bak |
16:21
🔗
|
iabak-reg |
registrar master 902ec21 other SHARD3/pubkeys registration of mail on SHARD3 |
16:33
🔗
|
|
hendi_ has quit IRC (ircd.choopa.net irc.eversible.com) |
16:33
🔗
|
|
Lord_ has quit IRC (ircd.choopa.net irc.eversible.com) |
16:33
🔗
|
|
espes__ has quit IRC (ircd.choopa.net irc.eversible.com) |
16:33
🔗
|
|
svchfoo3 has quit IRC (ircd.choopa.net irc.eversible.com) |
16:33
🔗
|
|
bpye has quit IRC (ircd.choopa.net irc.eversible.com) |
16:38
🔗
|
|
espes___ has joined #internetarchive.bak |
16:42
🔗
|
|
bpye_ has joined #internetarchive.bak |
16:43
🔗
|
|
hendi has joined #internetarchive.bak |
16:43
🔗
|
|
Lord__ has joined #internetarchive.bak |
16:43
🔗
|
|
Start has quit IRC (Disconnected.) |
17:22
🔗
|
|
primus104 has joined #internetarchive.bak |
18:38
🔗
|
tpw_rules |
closure: when is a new build with --incomplete coming? |
18:54
🔗
|
|
Start has joined #internetarchive.bak |
18:58
🔗
|
tpw_rules |
i think the fsck is locking too much |
18:59
🔗
|
tpw_rules |
it seems to hold a lock on the archive while the checksum is happening so there's a lot of time that ./iabak is waiting to get a new file to download |
19:06
🔗
|
tpw_rules |
it looks like a lot of stuff is going from cdbbsarchive |
19:06
🔗
|
tpw_rules |
also closure i have even more crazy ideas |
19:24
🔗
|
|
Start has quit IRC (Disconnected.) |
20:08
🔗
|
|
mariusz has quit IRC (Read error: Operation timed out) |
20:51
🔗
|
|
hendi has quit IRC (Ping timeout: 259 seconds) |
20:59
🔗
|
|
sankin has quit IRC (Leaving.) |
22:09
🔗
|
|
garyrh has quit IRC (Read error: Connection reset by peer) |
22:27
🔗
|
|
Start has joined #internetarchive.bak |
22:33
🔗
|
|
garyrh has joined #internetarchive.bak |
22:54
🔗
|
|
Apathy_ has joined #internetarchive.bak |
23:14
🔗
|
Senji |
tpw_rules: I've not noticed any locking between fscking and getting; and one of my shards takes more than 5 hours to fsck |
23:14
🔗
|
Senji |
(cleopatra is Realyl Slow™ by modern computer standards) |
23:15
🔗
|
Senji |
Various modes of failing to get a file take *forever* though; I think waiting for timeouts in wget |
23:41
🔗
|
|
mntasauri has quit IRC (Max SendQ exceeded) |
23:41
🔗
|
|
mntasauri has joined #internetarchive.bak |
23:43
🔗
|
|
mntasauri has quit IRC (Max SendQ exceeded) |
23:43
🔗
|
|
mntasauri has joined #internetarchive.bak |
23:46
🔗
|
|
primus104 has quit IRC (hub.se irc.efnet.pl) |