Understanding git gc --auto


Question

I'm experimenting with fairly aggressive auto gc in Git, mainly for packing purposes. In my repos, git config --list shows that I have set up

...
gc.auto=250
gc.autopacklimit=30
...

If I run git count-objects -v, I get

count: 376
size: 1251
in-pack: 2776
packs: 1
size-pack: 2697
prune-packable: 0
garbage: 0

But git gc --auto doesn't change these figures; nothing is being packed! Shouldn't the loose objects get packed, since I'm 126 objects over the gc.auto limit?

Answer

One of the main points of gc --auto is that it should be very quick, so other commands can frequently call it "just in case". To achieve that, the object count is only estimated, not counted exactly. As git help config says under gc.auto:

When there are approximately more than this many loose objects in the repository […]

Looking at the code (too_many_loose_objects() in builtin/gc.c), here's what happens:

  1. The gc.auto value is divided by 256 and rounded up.
  2. The folder that contains all the loose objects whose IDs start with 17 (.git/objects/17) is opened.
  3. Git checks whether that folder contains more objects than the result of step 1.
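
The three steps above can be sketched in Python. This is only an approximation for illustration, not Git's own C code: the function name mirrors the one in gc.c, but the signature, the git_dir parameter, and the 38-hex-digit file name filter (which assumes a SHA-1 repository) are assumptions made here.

```python
import math
import os
import re

# Assumption: a SHA-1 repository, where a loose object file is named with
# the remaining 38 hex digits of its ID (the first 2 digits name the folder).
LOOSE_NAME = re.compile(r"^[0-9a-f]{38}$")


def too_many_loose_objects(git_dir: str, gc_auto: int) -> bool:
    """Approximate the sampling check described above (not Git's own code)."""
    if gc_auto <= 0:
        return False  # gc.auto = 0 disables the check
    # Step 1: divide gc.auto by 256 and round up.
    threshold = math.ceil(gc_auto / 256)
    # Step 2: look only at the folder for objects whose IDs start with 17.
    sample_dir = os.path.join(git_dir, "objects", "17")
    try:
        names = os.listdir(sample_dir)
    except FileNotFoundError:
        return False  # no loose object starts with 17 yet
    in_sample = sum(1 for name in names if LOOSE_NAME.match(name))
    # Step 3: gc is due only if the sample holds MORE than the threshold.
    return in_sample > threshold
```

With gc.auto = 250 the threshold is 1, so the check passes only once objects/17 holds at least 2 loose objects, no matter how many loose objects sit in the other 255 folders.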

This works fine, since SHA-1 hashes are evenly distributed, so "all the objects that start with X" is representative of the whole set. But of course this only works for a large number of objects. Too lazy to do the maths, I would guess at least >3000. With 6700 (the default value of gc.auto), this should already work quite reliably.

The core question for me is why you need such a low setting, and whether it matters that gc really runs at exactly 250 objects. With a setting of 250, gc will run as soon as you have 2 loose objects that start with 17. Assuming each object lands in folder 17 with probability 1/256, the chance of that is roughly 68% at 600 loose objects, 82% at 800, and 90% at 1000.
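
The chance that at least 2 of n loose objects start with 17 can be estimated with a simple binomial model. The sketch below assumes each object ID lands in folder 17 independently with probability 1/256; the function name is made up for illustration.

```python
def prob_at_least_two(loose_objects: int, buckets: int = 256) -> float:
    """P(at least 2 of n objects fall into one particular bucket)."""
    p = 1 / buckets          # chance that a single object starts with 17
    q = 1 - p
    n = loose_objects
    p_none = q ** n                  # no object in the bucket
    p_one = n * p * q ** (n - 1)     # exactly one object in the bucket
    return 1 - p_none - p_one


for n in (376, 600, 800, 1000):
    print(n, round(prob_at_least_two(n), 2))
```

For the 376 loose objects in the question, this model gives roughly 43%, so gc --auto doing nothing in that repository is not surprising.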

Update: Couldn't help it, had to do the math :). I was wondering how well that estimation system would work. Here's a plot of the results. For any given gc.auto, how likely is it that gc will start when there are gc.auto (red) / gc.auto * 1.1 (green) / gc.auto * 1.2 (orange) / gc.auto * 1.5 (blue) / gc.auto * 2 (purple) loose objects in the repo?
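
The numbers behind such a plot can be approximated for any gc.auto value. This is a sketch under the same binomial assumption (each loose object starts with 17 with probability 1/256, independently); trigger_probability is a hypothetical helper, not part of Git.

```python
import math


def trigger_probability(gc_auto: int, loose_objects: int) -> float:
    """P(folder 17 holds more than ceil(gc_auto / 256) of the loose objects)."""
    threshold = math.ceil(gc_auto / 256)
    p = 1 / 256
    # Sum P(X = k) for k = 0..threshold, then take the complement.
    at_most = sum(
        math.comb(loose_objects, k) * p ** k * (1 - p) ** (loose_objects - k)
        for k in range(min(threshold, loose_objects) + 1)
    )
    return 1 - at_most


# One row per gc.auto value; columns are 1x, 1.1x, 1.2x, 1.5x and 2x
# that many loose objects in the repository.
for gc_auto in (250, 1000, 6700):
    row = [
        round(trigger_probability(gc_auto, round(gc_auto * factor)), 2)
        for factor in (1, 1.1, 1.2, 1.5, 2)
    ]
    print(gc_auto, row)
```

The larger gc.auto is, the more sharply the probability rises once the real count passes the configured value, which is why the estimate is far less reliable at a setting like 250 than at the default.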
