#internetarchive.bak 2016-11-16,Wed

Time Nickname Message
00:04 🔗 db48x has quit IRC (Quit: train)
00:54 🔗 godane has joined #internetarchive.bak
01:10 🔗 n00b428 has joined #internetarchive.bak
01:12 🔗 n00b428 Hello all. I'm having some problems with a new install on Ubuntu
01:13 🔗 n00b428 verification of content failed; Unable to access these remotes: web; Try making some of these repositories available:
01:13 🔗 n00b428 Any thoughts?
01:21 🔗 iabak-reg registrar master cf24b43 other SHARD10/pubkeys registration of octobyt3 on SHARD10
01:23 🔗 n00b428 Alrighty then....
01:23 🔗 n00b428 has quit IRC (Quit: Page closed)
01:26 🔗 iabak-reg registrar master 563b210 other SHARD4/pubkeys registration of octobyt3 on SHARD4
01:57 🔗 sevs has joined #internetarchive.bak
03:09 🔗 closure thelsdj: build is ready: https://downloads.kitenet.net/git-annex/autobuild/armel/git-annex-standalone-armel.tar.gz
03:12 🔗 thelsdj closure: huh, still doesn't work, same error
03:14 🔗 iabak-reg registrar master 871cec1 other SHARD3/pubkeys registration of mike on SHARD3
03:21 🔗 cmaldonad has joined #internetarchive.bak
03:23 🔗 closure ah, I see, the way I wrapped the linker didn't work
03:24 🔗 cmaldonad has quit IRC (Client Quit)
03:25 🔗 cmaldonad has joined #internetarchive.bak
03:28 🔗 closure also interestingly, I notice that binaries such as git on arm all do use a 64k page size.
03:28 🔗 closure except for haskell ones, which are linked with ld.gold.
03:29 🔗 closure probably the bug is there
03:29 🔗 thelsdj yep, the others work
03:36 🔗 iabak-reg registrar master 5a5cc9a other SHARD12/pubkeys registration of kyle on SHARD12
03:45 🔗 cmaldonad has quit IRC (Quit: This computer has gone to sleep)
03:58 🔗 closure gar, it's rebuilding from the top
04:06 🔗 SketchCow Signups are going gangbusters!
04:24 🔗 Lord_Nigh SketchCow: if we can get the shard providing thing into a package format that will easily run on a nas using a settable amount of space, you'd have hundreds more volunteers
05:08 🔗 SketchCow Agreed
05:08 🔗 SketchCow Now who wants to do it?
05:48 🔗 yipdw oh, interesting
05:48 🔗 yipdw ia-mine's using IA's advanced search API, but uh
05:48 🔗 yipdw https://archive.org/advancedsearch.php?q=collection:coverartarchive&page=202&rows=50&output=json
05:48 🔗 yipdw I think this means we'll only ever get the first 10,000 results of a collection
05:49 🔗 * yipdw checks out the scraping API
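A minimal sketch of the cursor-style paging that the scraping API offers in place of `page`/`rows` offsets, which is how a client could walk a whole collection rather than stopping at the first 10,000 results. The endpoint path, the `count` and `cursor` parameters, and the `items`/`cursor` response fields are written from memory of the public API and should be treated as assumptions; `fetch_page` and `iter_identifiers` are hypothetical names, not anything from ia-mine.

```python
import json
from typing import Callable, Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the Internet Archive scraping API.
SCRAPE_URL = "https://archive.org/services/search/v1/scrape"


def fetch_page(params: dict) -> dict:
    """Fetch one page of results and decode the JSON body."""
    with urlopen(f"{SCRAPE_URL}?{urlencode(params)}") as resp:
        return json.load(resp)


def iter_identifiers(
    query: str,
    fetch: Callable[[dict], dict] = fetch_page,
    count: int = 1000,
) -> Iterator[str]:
    """Yield every identifier matching `query`, following cursors."""
    cursor = None
    while True:
        params = {"q": query, "fields": "identifier", "count": count}
        if cursor:
            # Only sent from the second request onward.
            params["cursor"] = cursor
        page = fetch(params)
        for item in page.get("items", []):
            yield item["identifier"]
        # Assumption: the last page carries no cursor.
        cursor = page.get("cursor")
        if not cursor:
            break
```

Passing `fetch` as a parameter keeps the paging loop testable without network access; in real use, `iter_identifiers("collection:coverartarchive")` would make one request per page until the cursor runs out.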
05:52 🔗 Lord_Nigh btw i noticed something
05:52 🔗 Lord_Nigh ...crap wrong channel
06:36 🔗 kyan has quit IRC (Quit: Leaving)
07:04 🔗 sevs I'm kinda curious, had to restart the iabak process and now it's doing "get MD5-<something> (from web...)" followed by "(checksum...) ok" over and over
07:05 🔗 sevs before it downloaded files but now it's using way less traffic
07:05 🔗 sevs is this expected/normal?
07:24 🔗 Zippit has quit IRC (Ping timeout: 260 seconds)
07:35 🔗 minus_ has joined #internetarchive.bak
07:53 🔗 minus_ has quit IRC (Quit: Bye)
07:54 🔗 minus_ has joined #internetarchive.bak
08:33 🔗 iabak-reg registrar master f264d8d other SHARD14/pubkeys registration of octobyt3 on SHARD14
08:36 🔗 iabak-reg registrar master b9fb989 other SHARD16/pubkeys registration of octobyt3 on SHARD16
08:55 🔗 Zippit has joined #internetarchive.bak
08:59 🔗 Zippit has quit IRC (Client Quit)
09:00 🔗 sevs has quit IRC (Ping timeout: 268 seconds)
11:43 🔗 CyberJaco is now known as zz_CyberJ
11:46 🔗 iabak-reg registrar master 251c42c other SHARD9/pubkeys registration of stefan on SHARD9
11:51 🔗 iabak-reg registrar master c1011a5 other SHARD10/pubkeys registration of mr.business1148 on SHARD10
11:51 🔗 cmaldonad has joined #internetarchive.bak
12:06 🔗 iabak-reg registrar master ab9a828 other SHARD10/pubkeys registration of mrote on SHARD10
12:27 🔗 TGMMilenk is now known as milenko
12:31 🔗 atomotic has joined #internetarchive.bak
12:43 🔗 atomotic has quit IRC (Read error: Connection timed out)
12:51 🔗 milenko So my client is currently pulling down a ~150GB /.git/annex/MD5-* file
12:51 🔗 markaro has joined #internetarchive.bak
12:51 🔗 milenko Client says it's coming in at ~2MB/s - but the host is only showing around 100kb/s
12:52 🔗 milenko That doesn't seem normal
13:04 🔗 cmaldonad has quit IRC (Quit: This computer has gone to sleep)
13:17 🔗 cmaldonad has joined #internetarchive.bak
13:42 🔗 cmaldonad has quit IRC (Quit: This computer has gone to sleep)
13:51 🔗 iabak-reg registrar master 95506be other SHARD12/pubkeys registration of mrote on SHARD12
13:56 🔗 iabak-reg registrar master fb1daa3 other SHARD9/pubkeys registration of mrote on SHARD9
14:02 🔗 iabak-reg registrar master 477ee58 other SHARD12/pubkeys registration of mariabak on SHARD12
14:05 🔗 iabak-reg registrar master a0e9e40 other SHARD10/pubkeys registration of mariabak on SHARD10
14:08 🔗 markaro has quit IRC ()
14:25 🔗 iabak-reg registrar master e3a6430 other SHARD10/pubkeys registration of mariabak on SHARD10
14:29 🔗 iabak-reg registrar master bafa0a5 other SHARD9/pubkeys registration of mariabak on SHARD9
14:41 🔗 iabak-reg registrar master eb7d6cf other SHARD3/pubkeys registration of iabakmar on SHARD3
14:47 🔗 iabak-reg registrar master f5e911c other SHARD16/pubkeys registration of iabakmar on SHARD16
14:49 🔗 sep332_ has quit IRC (konversation out)
14:51 🔗 sep332_ has joined #internetarchive.bak
14:52 🔗 cmaldonad has joined #internetarchive.bak
14:54 🔗 Start has quit IRC (Quit: Disconnected.)
14:55 🔗 cmaldonad has quit IRC (Client Quit)
14:58 🔗 markaro has joined #internetarchive.bak
15:16 🔗 SketchCow Excellent
15:23 🔗 iabak-reg registrar master 4b08ea7 other SHARD3/pubkeys registration of mitch on SHARD3
17:10 🔗 markaro has quit IRC ()
17:24 🔗 iabak-reg registrar master dd40f18 other SHARD15/pubkeys registration of octobyt3 on SHARD15
17:40 🔗 sevs has joined #internetarchive.bak
17:44 🔗 computerf has quit IRC (Read error: Operation timed out)
18:12 🔗 atomotic has joined #internetarchive.bak
18:13 🔗 computerf has joined #internetarchive.bak
18:20 🔗 atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
18:51 🔗 kyan has joined #internetarchive.bak
19:13 🔗 iabak-reg registrar master 6d7eea2 other SHARD10/pubkeys registration of iabakmar on SHARD10
19:15 🔗 iabak-reg registrar master 299f6ad other SHARD3/pubkeys registration of iabakmar on SHARD3
19:42 🔗 Start has joined #internetarchive.bak
20:09 🔗 Start has quit IRC (Remote host closed the connection)
20:10 🔗 Blackout So I think I saw this mentioned earlier but is an encrypted rclone mount on google drive or ACD potentially fair game for storage?
20:13 🔗 SketchCow If you can engineer storage that passes the fixity and interaction check it's up for grabs.
20:13 🔗 Blackout So you guys are syncing data and periodically verifying it?
20:13 🔗 SketchCow Yes
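To make the fixity idea concrete, here is a rough sketch of what verifying one stored object against its own name could look like. It assumes the object's filename is a git-annex style key of the form `MD5-s<size>--<hex digest>` (the `MD5-*` names seen earlier in this log suggest that backend); the exact key layout is an assumption, and `verify_annex_object` is a hypothetical helper. In practice the project's checks are presumably done by git-annex itself (for example via `git annex fsck`), not by code like this.

```python
import hashlib
import re
from pathlib import Path

# Assumed key layout: MD5-s<size in bytes>--<32 hex chars>, optionally MD5E
# with a trailing extension after the digest.
KEY_RE = re.compile(r"^MD5E?-s(?P<size>\d+)--(?P<digest>[0-9a-f]{32})")


def verify_annex_object(path) -> bool:
    """Return True if the file's size and MD5 match the key in its name."""
    path = Path(path)
    match = KEY_RE.match(path.name)
    if match is None:
        raise ValueError(f"not an MD5 key name: {path.name}")

    # Cheap check first: a size mismatch means the content cannot match.
    if path.stat().st_size != int(match["size"]):
        return False

    # Hash in 1 MiB blocks so very large objects do not need to fit in RAM.
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest() == match["digest"]
```

Any storage backend that hands back exactly the bytes it was given would pass a check of this shape, which is the point of "if it passes the fixity check it's up for grabs".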
20:14 🔗 Blackout Ok because I have a 1g/1g unmetered line but I don't keep a ton of storage local
20:14 🔗 Blackout I've done 22TB of transit this month and heard no complaints from my ISP lol
20:14 🔗 iabak-reg registrar master d2fb17f other SHARD3/pubkeys registration of mike on SHARD3
20:15 🔗 Blackout SketchCow, What kind of file sizes are we talking about?
20:15 🔗 SketchCow Individual files or chunks?
20:16 🔗 Blackout Like am I syncing 50G megawarcs or 10 million tiny files
20:16 🔗 Blackout Purely for the sake of filesystem requirements
20:18 🔗 SketchCow Could be either
20:18 🔗 SketchCow No way to tell
20:19 🔗 Blackout Ok so they're not wrapped by you guys?
20:23 🔗 closure Blackout: git-annex can chunk larger files to smaller chunks when using a special remote; there are some special remotes supporting google drive
20:27 🔗 SketchPho has quit IRC (Quit: Connection closed for inactivity)
20:49 🔗 Blackout closure, so I'm not familiar with git-annex. Would that negate the need to mount the drive via rclone?
20:50 🔗 sevs So I've been running iabak since yesterday on two machines, the folders are 302G and 318G, however the values on the stat page sit at 50 and 80G - am I doing something wrong?
20:51 🔗 Start has joined #internetarchive.bak
20:52 🔗 Start has quit IRC (Client Quit)
20:52 🔗 SketchCow sevs: Be patient as we work it out. Some of the devs here might ask you some questions so we can check the reporting mechanism.
20:53 🔗 SketchCow It might be as simple as closure has it running once a day.
20:53 🔗 sevs SketchCow: ahh, ok
20:54 🔗 sevs Was just confused since it apparently was updating every 10 minutes
20:55 🔗 Start has joined #internetarchive.bak
20:55 🔗 sevs ^since the stats seemed to be updating every 10 minutes
20:59 🔗 Start has quit IRC (Client Quit)
21:00 🔗 voovik198 has joined #internetarchive.bak
21:02 🔗 voovik198 has left
21:03 🔗 Start has joined #internetarchive.bak
21:10 🔗 DFJustin JASON GO BACK TO FUCKING SLEEP
21:16 🔗 SketchCow So maaaany yyy thiiinggss to dooooo
21:36 🔗 n00b473 has joined #internetarchive.bak
21:41 🔗 closure thelsdj: build is ready (second try): https://downloads.kitenet.net/git-annex/autobuild/armel/git-annex-standalone-armel.tar.gz
21:42 🔗 thelsdj trying it out
21:43 🔗 thelsdj works! now to see if everything else works on drobo
21:43 🔗 closure sevs: your progress is only synced back periodically, much less frequently than 10 minutes. Give it a couple of days
21:43 🔗 closure thelsdj: damn, nice!
21:43 🔗 closure so was that on the WD NAS?
21:44 🔗 closure I was only able to get it to build with a 32kb page size, not 64k
21:44 🔗 thelsdj its a Drobo 5N
21:44 🔗 thelsdj which has 16k page size
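For anyone wanting to report the page size of their own device, a small sketch of how to read it, assuming the box has a Python 3 interpreter (many NAS units will not; `getconf PAGESIZE` from a shell should give the same number where `getconf` is available). `page_size_bytes` is just an illustrative name.

```python
import os
import platform


def page_size_bytes() -> int:
    """Return the kernel's memory page size in bytes (Unix-only)."""
    return os.sysconf("SC_PAGE_SIZE")


if __name__ == "__main__":
    # Prints the CPU architecture and the page size in KiB.
    print(f"{platform.machine()}: {page_size_bytes() // 1024}k pages")
```

The value matters here because a binary linked for a smaller maximum page size than the kernel uses may fail to load, which matches the symptoms discussed above.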
21:44 🔗 thelsdj i get this when running iabak a second time: fatal: unable to access 'https://github.com/ArchiveTeam/IA.BAK/': Problem with the SSL CA cert (path? access rights?)
21:44 🔗 thelsdj trying rest but not sure if that is fatal
21:45 🔗 closure Blackout: http://git-annex.branchable.com/tips/using_Google_Cloud_Storage/
21:45 🔗 thelsdj Drobo also doesn't have 'tempfile' command by default
21:46 🔗 closure probably the NAS doesn't have a ssl cert store. https is only used for cloning the IA.BAK repo
21:46 🔗 closure IA.BAK probably will need some porting for such embedded systems.
21:47 🔗 thelsdj yeah its close to being a functional linux but doesn't have a lot of expected helper apps and such
21:48 🔗 thelsdj not sure why the git clone fails since i'm already in the git directory i cloned?
21:50 🔗 thelsdj ah doesn't have a cron by default either but i think i can install one
21:50 🔗 closure it does a git pull to update itself
21:51 🔗 thelsdj the git pull works it says 'Already up-to-date.' at the top, but then i get that SSL CA cert error a few lines down
21:54 🔗 Start has quit IRC (Remote host closed the connection)
21:55 🔗 closure thelsdj: iirc you said some device needed 64kb page size?
21:57 🔗 thelsdj appears that WD My Cloud devices at least in newer firmware versions have 64k page size
21:57 🔗 closure so, not something you can test?
21:57 🔗 closure unfortunately 64kb page size causes ld.gold to fail with internal error. bugs filed etc
21:58 🔗 thelsdj nope, i don't have one, but that was what the original bug i saw had so might be worth reaching out to the person who reported it https://git-annex.branchable.com/bugs/git-annex_won__39__t_execute_on_WD_My_Cloud_NAS/
21:59 🔗 thelsdj so yeah looks like git pull fails with the git in git-annex download, but works fine with the git I have installed on my device
22:00 🔗 thelsdj does git annex require a minimum git version? i think the one i have on the device is kind of old
22:01 🔗 iabak-reg registrar master 200ee77 other SHARD14/pubkeys registration of fusl on SHARD14
22:01 🔗 thelsdj looks like 2.5 is the latest available built for drobo, i have 2.2 installed right now
22:03 🔗 iabak-reg registrar master 202a210 other SHARD16/pubkeys registration of thelsdj on SHARD16
22:04 🔗 thelsdj welp, its doing something
22:07 🔗 Blackout closure, google cloud storage != google drive
22:08 🔗 Blackout But yeah I see the page for it so I'll probably look into it
22:10 🔗 thelsdj ugh, i forgot that Drobo doesn't report actual free space with 'df' so I have to like say 'save 8TB' when I only want it to save 2
22:22 🔗 iabak-reg registrar master 2b7770d other SHARD4/pubkeys registration of mitch on SHARD4
22:32 🔗 n00b473 has quit IRC (Quit: http://chat.efnet.org )
23:13 🔗 godane here is a non-iabak option for people: http://pastebin.com/Hzz56QDe
23:14 🔗 godane its code to grab collections items
23:14 🔗 godane also it only grabs the original files
23:19 🔗 yipdw Blackout: each shard is many sets of collection/item/files, where the files are as-they-are on IA; so, no, they aren't aggregated or otherwise transformed
23:19 🔗 yipdw that said, we do limit the size of each shard
23:28 🔗 Start has joined #internetarchive.bak
23:39 🔗 Blackout ack
23:52 🔗 sevs has quit IRC (Quit: Page closed)
