[00:40] *** X-Scale has quit (Ping timeout: 240 seconds)
[00:41] *** X-Scale (~gbabios@[redacted]) has joined #internetarchive.bak
[01:23] OK, great news.
[01:23] https://archive.org/details/ia-bak-census_20150304 now has a proper, public list
[01:38] great picture :)
[01:59] *** X-Scale has quit (Ping timeout: 240 seconds)
[02:09] *** X-Scale (~gbabios@[redacted]) has joined #internetarchive.bak
[02:29] What is the lower end of intended users of this?
[02:39] ?
[02:39] Poor people?
[02:56] ex. I have a capped ADSL connection but a few TB of unused disk space
[02:57] Would i be able to specify how many GB/time to download for this?
[03:07] No.
[03:07] You're in the wrong situation for this.
[03:08] okay
[03:19] *** zottelbey has quit (Remote host closed the connection)
[04:19] *** thunk has quit (http://www.kiwiirc.com/ - A hand crafted IRC client)
[04:21] *** thunk (4746deec@[redacted]) has joined #internetarchive.bak
[04:33] *** thunk has quit (http://www.kiwiirc.com/ - A hand crafted IRC client)
[05:25] actually
[05:25] you could participate
[05:26] but you would have to do some things manually
[05:26] when you clone a git annex repository, you get all of the metadata for all of the files in that repository, but not the files themselves
[05:27] you can then request any of those files at any time, and they'll be downloaded from any other available source (which in this case will usually be IA)
[05:28] the automated thing we're doing would start you out with a list of files that you should go ahead and download, which git annex will then do
[05:29] but you can request files manually to use up bandwidth as you like instead
[05:30] of course, with some work on git annex we could probably tweak it to respect bandwidth caps automatically, but I don't think it does at the moment
[05:51] *** thunk (4746deec@[redacted]) has joined #internetarchive.bak
[06:19] I don't think i have the experience needed to make it practical to do that
[06:19] If you guys do end up making something us non-experts can use, I'd be happy to try it out
[06:20] thanks for the info anyway
[06:39] *** thunk has quit (http://www.kiwiirc.com/ - A hand crafted IRC client)
[06:58] *** csssuf has quit (Read error: Connection reset by peer)
[07:03] *** csssuf (~csssuf@[redacted]) has joined #internetarchive.bak
[07:08] Ctrl-S: fair enough :)
[07:17] *** Sanqui has quit (Ping timeout: 370 seconds)
[07:17] *** csssuf has quit (Ping timeout: 370 seconds)
[07:18] *** csssuf (~csssuf@[redacted]) has joined #internetarchive.bak
[07:19] *** Sanqui (~Sanky_R@[redacted]) has joined #internetarchive.bak
[07:53] *** mrfoo (sid25914@[redacted]) has joined #internetarchive.bak
[09:09] *** bzc6p_ (~bzc6p@[redacted]) has joined #internetarchive.bak
[09:16] *** bzc6p has quit (Read error: Operation timed out)
[10:23] *** db48x has quit (Read error: Connection reset by peer)
[10:37] *** bzc6p_ is now known as bzc6p
[12:00] question: dark items. how do we know these items stay dark as in, people don't see them on their own system?
[12:01] nevermind, it's discussed in the talk.
[12:39] *** thunk (4746deec@[redacted]) has joined #internetarchive.bak
[12:40] *** thunk has quit (Client Quit)
[12:51] *** thunk (4746deec@[redacted]) has joined #internetarchive.bak
[13:03] *** zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak
[13:03] *** thunk has quit (http://www.kiwiirc.com/ - A hand crafted IRC client)
[14:30] *** Start has quit (Disconnected.)
[15:04] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[15:05] *** Start has quit (Read error: Connection reset by peer)
[15:05] *** Start_ (~Start@[redacted]) has joined #internetarchive.bak
[15:51] *** Start_ has quit (Disconnected.)
[15:57] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[15:58] *** Start has quit (Read error: Connection reset by peer)
[15:58] *** Start_ (~Start@[redacted]) has joined #internetarchive.bak
[15:58] *** Start_ has quit (Client Quit)
[16:01] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[16:28] *** garyrh has quit (Remote host closed the connection)
[16:45] *** Start has quit (Disconnected.)
[16:50] The key is a tiered approach
[16:51] I am desperately doing "physical" things this week so I haven't sat down to update the wiki.
[16:52] *** garyrh (garyrh@[redacted]) has joined #internetarchive.bak
[16:52] *** svchfoo2 gives channel operator status to garyrh
[16:56] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[17:37] *** VADemon (~VADemon@[redacted]) has joined #internetarchive.bak
[17:42] *** Start has quit (Disconnected.)
[18:17] there are 380,402 files in the census that are just the word "done" in a file. what's with that?
[18:38] oh, every item in the wikimediadownloads collection has a status.txt that just says "done" lol
[18:43] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[18:44] *** Start has quit (Read error: Connection reset by peer)
[18:44] *** Start_ (~Start@[redacted]) has joined #internetarchive.bak
[18:55] *** X-Scale has quit (Remote host closed the connection)
[18:55] *** X-Scale (~gbabios@[redacted]) has joined #internetarchive.bak
[18:59] *** zottelbey has quit (Remote host closed the connection)
[19:03] *** zottelbey (~zottelbey@[redacted]) has joined #internetarchive.bak
[19:03] *** Start_ has quit (Disconnected.)
[19:03] *** thunk (4746deec@[redacted]) has joined #internetarchive.bak
[19:04] *** thunk has quit (Client Quit)
[19:15] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[19:26] *** Start has quit (Disconnected.)
[19:31] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[19:32] *** Start has quit (Read error: Connection reset by peer)
[19:32] *** Start_ (~Start@[redacted]) has joined #internetarchive.bak
[19:54] The census is very informative, no
[19:57] it's pretty exciting. i've never been motivated to learn awk before haha
[19:57] It would help me if you updated the wiki to describe what you find, including interesting things.
[19:57] Make a new item, Internet Archive Census
[19:57] Or I can
[19:58] yeah i'm not that familiar with the wiki
[19:58] i do have some random observations that are cool but aren't really related to the backup project
[19:58] I want them regardless
[20:00] i mean on the INTERNETARCHIVE.BAK page or another one?
[20:00] I'm doing it, relax
[20:00] I'll link you in a moment.
[20:02] ok :)
[20:13] http://archiveteam.org/index.php?title=Internet_Archive_Census
[20:13] Needs more updating.
[20:20] *** Start_ has quit (Disconnected.)
[21:09] *** bzc6p_ (~bzc6p@[redacted]) has joined #internetarchive.bak
[21:16] *** bzc6p has quit (Ping timeout: 600 seconds)
[21:34] *** bzc6p_ is now known as bzc6p
[21:44] *** thunk (4746deec@[redacted]) has joined #internetarchive.bak
[22:30] *** Start (~Start@[redacted]) has joined #internetarchive.bak
[22:31] *** svchfoo2 gives channel operator status to Start