[00:03] *** odemg has joined #internetarchive.bak
[01:08] *** bwn has quit IRC (Read error: Connection reset by peer)
[01:20] *** bwn has joined #internetarchive.bak
[07:58] *** atomotic has joined #internetarchive.bak
[08:25] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[08:57] *** atomotic has joined #internetarchive.bak
[10:13] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[10:22] *** odemg has quit IRC (Quit: Leaving)
[10:30] *** odemg has joined #internetarchive.bak
[12:03] *** zhongfu has quit IRC (Ping timeout: 260 seconds)
[12:28] *** atomotic has joined #internetarchive.bak
[12:38] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
[14:14] *** medowar has joined #internetarchive.bak
[14:27] *** atomotic has joined #internetarchive.bak
[14:55] *** sunny256 has joined #internetarchive.bak
[15:04] It seems as if some of the shards need a git gc or something. I tried to clone shard5 yesterday; it took more than 5 hours, with the "counting objects" phase alone taking several hours. shard6 was still not finished after 7 hours and barely moving forward, so I interrupted it.
[15:09] It worked when fetching on top of some old clones, though. Are there any places to download a "pre-populated" Git repo as a .tar.gz or bundle?
[15:10] It would reduce the server strain if it doesn't have to compress all the initial Git objects.
[15:13] FWIW, I'm pushing shards 1-9 to gitlab.com/iabak/ so people can also clone from there.
[15:17] Maybe that could be something to implement in checkoutshard. When cloning the shard for the first time, first rsync a pretty recent bundle with most of the Git objects, init the local shard, fetch from the bundle and then fetch the rest of the objects from the server.
[15:21] I can create a patch for it if it sounds like a viable idea.
[15:56] *** kyan has quit IRC (Read error: Operation timed out)
[16:09] sunny256: it might be worth investigating. the shards are gc'd regularly, but the servers are overloaded and it takes ages
[16:11] *** atomotic has quit IRC (Quit: My MacBook has gone to sleep. ZZZzzz…)
[16:14] *** atomotic has joined #internetarchive.bak
[16:16] db48x: I can have a look at it during the weekend and create a pull request. For now I can create some initial bundles on my server until there are some at archiveteam.org or somewhere else.
[16:18] that works
[16:18] putting the shards on an external host is actually a pretty interesting idea
[16:18] might be simpler than dealing with bundles
[16:19] it'd be trivial to have the first clone come from gitlab, and then have it change the remote url to point to the iabak server for regular updates
[16:26] I'm trying a test clone from git@gitlab.com:iabak/shard1.git now to see how it copes with it, seems to go well. The reason I mentioned bundles is to avoid the "counting objects" and "compressing objects" phases, but it seems to work well.
[16:28] It could also be an idea to add an option to checkoutshard where a local repo or bundle can be specified to get the initial objects from. I'll see how it ends up.
[16:30] I had to cancel the gitlab pushes, btw. The "resolving deltas" phase would've taken more than three hours. o.O Maybe I'll send the Gitlab team an email first to ask if it's ok to host the shards there.
[16:36] heh
[16:56] *** atomotic has quit IRC (Quit: Textual IRC Client: www.textualapp.com)
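
The seeding idea discussed above (take the first clone from a bundle or an external mirror, then point the remote back at the regular server) could look roughly like the sketch below. It is only an illustration, not the actual checkoutshard code: the function names, the bundle file name, and the server URL are hypothetical, and it uses `git clone` on the seed rather than the init-then-fetch sequence mentioned at 15:17. It assumes the seed (a bundle file already downloaded, or a mirror URL) is reachable.

```python
import subprocess
from typing import List


def seed_clone_commands(seed: str, server_url: str, dest: str) -> List[List[str]]:
    """Build the git commands for a seeded first clone.

    seed       -- bundle file path or mirror URL (assumed to be reachable)
    server_url -- the regular shard remote (hypothetical value in the usage note)
    dest       -- local directory for the shard
    """
    return [
        # The bulk of the objects comes from the seed, so the main server
        # does not have to count and compress the whole history.
        ["git", "clone", seed, dest],
        # Point origin back at the regular server for later updates.
        ["git", "-C", dest, "remote", "set-url", "origin", server_url],
        # Pick up whatever was committed after the seed was made.
        ["git", "-C", dest, "fetch", "origin"],
    ]


def run(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Something like `run(seed_clone_commands("shard1.bundle", "ssh://example.invalid/shard1", "shard1"))` would then perform the seeded clone; the same call works with a mirror URL in place of the bundle path.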