[02:02] *** kyan has quit IRC (Quit: Leaving)
[02:08] *** kyan has joined #internetarchive.bak
[03:27] *** Start has quit IRC (Read error: Connection reset by peer)
[03:28] *** Start has joined #internetarchive.bak
[04:07] *** ersi has quit IRC (Read error: Operation timed out)
[04:57] *** ersi has joined #internetarchive.bak
[04:57] *** svchfoo3 sets mode: +o ersi
[05:19] *** pikhq has quit IRC (Ping timeout: 506 seconds)
[05:56] *** pikhq has joined #internetarchive.bak
[05:56] *** svchfoo2 sets mode: +o pikhq
[06:01] *** zottelbey has joined #internetarchive.bak
[06:25] *** kyan has quit IRC (Remote host closed the connection)
[06:48] *** niyaje4 has joined #internetarchive.bak
[07:38] *** kyan has joined #internetarchive.bak
[08:07] *** niyaje4 has quit IRC (Ping timeout: 600 seconds)
[08:15] I have a bunch of (small) files in shard 1 that fail fsck even after I've redownloaded them: http://pastebin.ubuntu.com/10855023/
[09:01] *** kyan has quit IRC (Quit: Leaving)
[10:55] *** niyaje4 has joined #internetarchive.bak
[11:26] *** niyaje4 has quit IRC (Ping timeout: 600 seconds)
[12:20] *** atomotic has joined #internetarchive.bak
[12:39] *** atomotic has quit IRC (Quit: My Mac has gone to sleep. ZZZzzz…)
[13:37] *** sankin has joined #internetarchive.bak
[13:50] *** Start has quit IRC (Disconnected.)
[13:50] *** Start has joined #internetarchive.bak
[13:50] *** Start has quit IRC (Client Quit)
[14:28] Senji: The XML files were generated after the MD5's. So their MD5's in the census were wrong
[14:29] Yes, closure worked around that issue
[14:29] I have a *lot* more xml files than that
[14:30] oh :(
[14:30] 12544 of them apparently :)
[14:37] *** Start has joined #internetarchive.bak
[14:52] *** sep332 has left
[14:52] *** sep332 has joined #internetarchive.bak
[14:53] *** svchfoo3 sets mode: +o sep332
[14:57] *** Start has quit IRC (Disconnected.)
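[editor's note] The 14:28 explanation (XML files regenerated after the census recorded their MD5s) means a fresh download can never match the recorded checksum. A minimal sketch of that comparison follows; it is not the project's actual fsck logic, and the names `md5_of`, `matches_census`, and `census_md5` are hypothetical, introduced only for illustration.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_census(path, census_md5):
    """Compare a file on disk against the checksum recorded at census time.

    census_md5 is a hypothetical stand-in for whatever value the census
    stored. If the file was regenerated upstream after the census, this
    stays False no matter how many times the file is downloaded again.
    """
    return md5_of(path) == census_md5.lower()
```

Under this reading, re-downloading cannot clear the mismatch; only a refreshed recorded checksum would.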
[15:01] *** Start has joined #internetarchive.bak
[15:04] *** Start has quit IRC (Read error: Connection reset by peer)
[15:20] *** toad1 has joined #internetarchive.bak
[15:26] *** toad2 has quit IRC (Read error: Operation timed out)
[15:40] Senji: all of them are xml files
[15:40] ?
[15:40] hmm, this could be metadata of items getting updated between the census and the download
[15:40] 20 xml files and one torrent file
[15:41] I suppose the problem is it'll fsck and then re-download, repeatedly
[15:42] closure: that's what it seems to be doing. It only takes about 5 mins to redownload the files so it's not a major problem; but I assume the underlying problem might be?
[15:43] I thought you said you had 12k such files?
[15:43] No, I have 12k xml files that *don't* have the problem
[15:43] aah
[15:45] ok guys
[15:45] anyone successfully got a shard on OSX?
[15:46] realeyes: someone tried and there was an issue, which I think we have fixed now
[15:46] closure i think it was me lol
[15:46] im willing to try again, with instructions
[15:46] ah, ok.. well, I think we fixed your issue :)
[15:57] Closure, closure! How go
[15:59] *** Start has joined #internetarchive.bak
[16:00] *** Start has quit IRC (Remote host closed the connection)
[16:01] *** Start has joined #internetarchive.bak
[16:02] http://hastebin.com/inusudedok.coffee closure
[16:02] *** Start has quit IRC (Client Quit)
[16:02] readlink: illegal option -- f
[16:04] SketchCow: think we're learning some things
[16:06] *** Start has joined #internetarchive.bak
[16:10] SketchCow: we badly need a user registration/contact system. Need to find someone to build that
[16:11] at the moment there is 1tb of data that *someone* has backed up on shard1, but we don't know who and they're not active.
[16:11] could have formatted the drive, could have fallen off irc. no way to know
[16:11] (actually 3 or 4 someones combined)
[16:12] How do you know that you don't know? Err. if you see what I mean?
[16:13] rumsfield rumsfield rumsfield
[16:13] Heh :)
[16:14] closure: 400GB of that is mine. i haven't been using that directory for ages
[16:14] which one is that?
[16:15] sean@librarian:/mnt/shelf02/archive/shard1
[16:15] ed5cad48-b5a9-46f4-8dcd-9bd446828116
[16:16] oh, that's only 400 mb
[16:16] oh lol
[16:16] well it's taking up 1.2GB on my drive, guess it only checked in like once
[16:16] I assumed most of those small ones were indeed thing people had tried out and given up on. we'll expire and redownload them
[16:17] sep332: maybe you should run iabak in it?
[16:17] it's ok, i've got 2.1TB of shard1 elsewhere
[16:17] a lot of this is just shard1 being special
[16:17] and shard1 seems to be in good shape for redundancy, i'll just get rid of it
[16:17] http://iabak.archiveteam.org/stats/SHARD2.expireleaderboard <-- much happier
[16:18] they've all checked in at least once, we need to get the cron job going, but
[16:20] I've not set up a cronjob because both my hosts are still downloading :)
[16:21] we've learned that maintenance mode is harder than active download mode :)
[16:22] The funny dip in the graph on shard1 last night was me git annex moving my shard1 from the machine that takes hours to fsck to the one that takes minutes
[16:22] (leaving only shard2 on the slow machine, so fsck overhead there should be a lot less)
[16:22] my readlink isn't working :/
[16:22] I could probably do SSHFS to my digitalocean centOS droplet
[16:23] then i wouldn't be having these OSX errors
[16:37] ------------------------------
[16:38] Does anyone want to help make a registration
[16:38] and contact system for the IA Backup?
[16:38] Feel free to use off-the-shelf items, as you
[16:38] feel you need to - and making it work
[16:38] on a AWS instance or something is fine
[16:38] as well
[16:38] ------------------------------
[16:43] *** Start has quit IRC (Disconnected.)
[16:54] *** Start has joined #internetarchive.bak
[17:17] I'll wait a day to see if anyone wants it, then I'll open it more aggressively.
[17:17] obviously, OBVIOUSLY, I think we shouldn't be adding more shards until we work it out, since we think expiring may happen
[17:26] *** kyan has joined #internetarchive.bak
[17:27] yeah
[17:28] and make something for OSX users to DL the shards too! lol
[17:42] *** Start has quit IRC (Disconnected.)
[18:02] *** balrog has quit IRC (Read error: Operation timed out)
[18:19] *** balrog has joined #internetarchive.bak
[18:46] *** Start has joined #internetarchive.bak
[19:00] *** Start has quit IRC (Disconnected.)
[19:32] *** kyan has quit IRC (Quit: Leaving)
[19:33] *** Start has joined #internetarchive.bak
[20:22] *** zottelbey has quit IRC (Remote host closed the connection)
[20:22] *** Start has quit IRC (Disconnected.)
[20:29] *** Start has joined #internetarchive.bak
[20:57] *** sankin has quit IRC (Leaving.)
[21:19] *** Start has quit IRC (Disconnected.)
[21:50] *** kyan has joined #internetarchive.bak
[22:30] *** ersi has quit IRC (Ping timeout: 240 seconds)
[22:41] *** Start has joined #internetarchive.bak
[22:42] *** Start-mob has joined #internetarchive.bak
[22:44] *** ersi has joined #internetarchive.bak
[22:44] *** svchfoo2 sets mode: +o ersi
[22:51] *** Start-mob has quit IRC (Remote host closed the connection)
[23:57] *** Ctrl-S has quit IRC (Read error: Operation timed out)
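[editor's note] The `readlink: illegal option -- f` error at 16:02 is what the BSD `readlink` shipped with OS X at the time prints when given GNU's `-f` flag, which it did not support. One common workaround is to resolve the path with Python's `os.path.realpath`. The sketch below is only an illustration, not the fix the project applied, and the function name `resolve_like_readlink_f` is hypothetical.

```python
import os


def resolve_like_readlink_f(path):
    """Return an absolute path with every symlink resolved.

    Roughly what GNU `readlink -f` prints; handling of paths that do not
    exist may differ between the two.
    """
    return os.path.realpath(path)
```

From a shell script this could be called as `python -c 'import os, sys; print(os.path.realpath(sys.argv[1]))' "$0"`, assuming a Python interpreter is on the PATH.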