How can one safely use a shared object database in git?
Problem description
I have read in several places that it's possible to share the objects directory between multiple git repositories, e.g. with symbolic links. I would like to do this to share the object databases between several bare repositories in the same directory:
shared-objects-database/
foo.git/
  objects -> ../shared-objects-database
bar.git/
  objects -> ../shared-objects-database
baz.git/
  objects -> ../shared-objects-database
(I'm doing this because there are going to be lots of large blobs redundantly stored in each objects directory otherwise.)
My concern about this is that when using these repositories, git gc will be called automatically and cause objects which are unreachable from one repository to be pruned, making the other repositories incomplete. Is there any easy way of ensuring that this doesn't happen? For example, is there a config option that would force --no-prune to be the default for git gc, and, if so, would that be sufficient to use this setup without risking losing data?
At the moment, I've been using the objects/info/alternates mechanism to share objects between these repositories, but maintaining these pointers from each repository to all the others is a bit hacky.
(My other alternative is just to have a single bare repository, with all the branches of foo.git, bar.git and baz.git named foo-master, foo-testing, bar-master, etc. However, that'd be a bit more work to manage, so if the symlinked objects directory can work safely, I'd rather do that.)
You might guess that this is one of those Using Git For What It Was Not Intended use cases, but I hope the question is clear and valid nonetheless ;)
Why not just crank the gc.pruneExpire variable up to never? It's unlikely you'll ever have loose objects 1000 years old that you don't want deleted.
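A minimal sketch of that setting applied to the layout from the question. The scratch directory from mktemp and the loop are illustrative assumptions, not something the answer spelled out. gc.pruneExpire is written into every repository that shares the object directory, since a gc run in any one of them acts on the shared store:

```shell
#!/bin/sh
set -eu

# Hypothetical scratch location; substitute the real parent directory.
root=$(mktemp -d)
mkdir "$root/shared-objects-database"

for name in foo bar baz; do
    repo="$root/$name.git"
    git init --quiet --bare "$repo"
    # Swap the private objects directory for a symlink to the shared one.
    rm -rf "$repo/objects"
    ln -s ../shared-objects-database "$repo/objects"
    # Ask gc in this repository never to prune unreachable loose objects.
    git -C "$repo" config gc.pruneExpire never
done

# Read the value back from one repository.
git -C "$root/foo.git" config gc.pruneExpire
```

Setting the value once with git config --global would also cover these repositories for that user, at the cost of suppressing pruning in unrelated repositories as well.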
To make sure that the things which really should be pruned do get pruned, you can keep one repo which has all the others as remotes. git gc would be quite safe in that one, since it really knows what is unreachable.
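The collector repository described above could be sketched roughly as follows. The name all.git, the scratch directory, and the empty commits (created only so that fetch has something to pick up) are assumptions for illustration. With the default refspec that git remote add sets up, only branches and the tags reachable from them are fetched, so objects reachable solely from other refs would not be protected by this sketch:

```shell
#!/bin/sh
set -eu

root=$(mktemp -d)
mkdir "$root/shared-objects-database"

# Create a bare repository whose objects directory is a symlink to the shared store.
make_shared_repo() {
    git init --quiet --bare "$root/$1.git"
    rm -rf "$root/$1.git/objects"
    ln -s ../shared-objects-database "$root/$1.git/objects"
}

for name in foo bar baz; do
    make_shared_repo "$name"
    repo="$root/$name.git"
    # Ordinary repositories never prune, as suggested earlier in the answer.
    git -C "$repo" config gc.pruneExpire never
    # Give each repository one empty commit on a branch named main.
    tree=$(git -C "$repo" mktree </dev/null)
    commit=$(git -C "$repo" -c user.name=Example -c user.email=example@example.com \
        commit-tree "$tree" -m "initial commit in $name")
    git -C "$repo" update-ref refs/heads/main "$commit"
done
foo_commit=$(git -C "$root/foo.git" rev-parse refs/heads/main)

# Hypothetical collector repository with every other repository as a remote.
make_shared_repo all
for name in foo bar baz; do
    git -C "$root/all.git" remote add "$name" "$root/$name.git"
done

# Refresh the remote-tracking refs, then let gc run with its default pruning
# rules here, where the refs of all repositories are visible.
git -C "$root/all.git" fetch --quiet --all
git -C "$root/all.git" gc --quiet
```

The fetch has to happen before every gc in the collector; otherwise its view of what is reachable lags behind the other repositories.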
Edit: Okay, I was a bit cavalier about the time limit; as is pointed out in the comments, 1000 years isn't gonna work too well, but the beginning of the epoch would, or never.