What are git's thin packs?


Problem description

I haven't found much on thin packs, and the man pages' information is rather cryptic about this. I know it has something to do with slow connections, but what would be a "slow connection"?

What are its pros and cons? When should I use it, when should I not use it?

Solution

For the record, the man page (index-pack) states:

It is possible for git-pack-objects to build "thin" pack, which records objects in deltified form based on objects not included in the pack to reduce network traffic.
Those objects are expected to be present on the receiving end and they must be included in the pack for that pack to be self contained and indexable.
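To make the quoted description more concrete, here is a minimal toy model in Python of a thin pack and of the "fix" step on the receiving end. It is only a sketch under stated assumptions: the prefix/suffix delta, the dictionary-based pack entries, and the helper names (oid, make_delta, build_thin_pack, fix_thin, resolve) are all hypothetical and do not reflect git's real pack format or code.

```python
import hashlib

def oid(data: bytes) -> str:
    # Simplified object id; real git hashes a type/size header plus the content.
    return hashlib.sha1(data).hexdigest()

def make_delta(base: bytes, target: bytes):
    # Toy delta: reuse the common prefix and suffix, store only the middle.
    limit = min(len(base), len(target))
    p = 0
    while p < limit and base[p] == target[p]:
        p += 1
    s = 0
    while s < limit - p and base[-1 - s] == target[-1 - s]:
        s += 1
    return (p, s, target[p:len(target) - s])

def apply_delta(base: bytes, delta) -> bytes:
    p, s, middle = delta
    return base[:p] + middle + base[len(base) - s:]

def build_thin_pack(changes, receiver_ids):
    # Sender side: each new object is a delta whose base is NOT put in the pack,
    # because the receiver is expected to have that base already.
    pack = []
    for target, base in changes:
        assert oid(base) in receiver_ids
        pack.append({"id": oid(target), "base": oid(base),
                     "delta": make_delta(base, target)})
    return pack

def fix_thin(pack, receiver_store):
    # Receiver side: append the missing bases so the pack becomes self-contained.
    in_pack = {entry["id"] for entry in pack}
    fixed = list(pack)
    for entry in pack:
        if entry["base"] not in in_pack:
            fixed.append({"id": entry["base"], "data": receiver_store[entry["base"]]})
            in_pack.add(entry["base"])
    return fixed

def resolve(pack):
    # Rebuild every object using only what is inside the pack (one delta level).
    full = {e["id"]: e["data"] for e in pack if "data" in e}
    for e in pack:
        if "delta" in e:
            full[e["id"]] = apply_delta(full[e["base"]], e["delta"])
    return full

old = b"line1\nline2\nline3\n"
new = b"line1\nline2 changed\nline3\n"
receiver_store = {oid(old): old}
thin = build_thin_pack([(new, old)], set(receiver_store))
fixed = fix_thin(thin, receiver_store)
```

In this model, resolve(thin) fails with a KeyError because the base is absent from the pack, while resolve(fixed) rebuilds the new object. That mirrors the point of the quote: the thin pack is smaller on the wire, but the bases have to be added on the receiving end before the pack can stand alone.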

This complements the description of the --thin option in the git push man page:

Thin transfer spends extra cycles to minimize the number of objects to be sent and meant to be used on slower connection

So a "slow network" in this case is any connection over which you want to send as little data as possible.


In this thread, Jakub Narębski explains a bit more (in the context of using git gc on the remote side as well as on the local side):

Git does deltification only in packfiles.
But when you push via SSH, git would generate a pack file with commits the other side doesn't have, and those packs are thin packs, so they also have deltas...
but the remote side then adds bases to those thin packs making them standalone.

More precisely:

On the local side:
git-commit creates loose (compressed, but not deltified) objects. git-gc packs and deltifies.

On the remote side (for smart protocols, i.e. git and ssh):
git creates thin pack, deltified;
on the remote side git either makes pack thick/self contained by adding base objects (object + deltas), or explodes pack into loose object (object).
You need git-gc on remote server to fully deltify on remote side. But transfer is fully deltified.

On the remote side (for dumb protocols, i.e. rsync and http):
git finds required packs and transfers them whole.
So the situation is like on local side, but git might transfer more than really needed because it transfers packs in full.
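The difference between the two kinds of protocol can be illustrated with a small set-based sketch. This is a simplification for illustration only: objects are represented by plain string ids, and smart_transfer and dumb_transfer are hypothetical helper names, not git functions.

```python
def smart_transfer(sender_objects, receiver_objects):
    # Smart protocol (model): only the objects the receiver lacks are sent.
    return set(sender_objects) - set(receiver_objects)

def dumb_transfer(sender_packs, wanted, receiver_objects):
    # Dumb protocol (model): any pack holding a missing object is fetched whole.
    missing = set(wanted) - set(receiver_objects)
    fetched = set()
    for pack in sender_packs:
        if missing & pack:
            fetched |= pack
    return fetched

sender_packs = [{"a", "b", "c"}, {"d", "e"}]
all_objects = set().union(*sender_packs)
receiver_has = {"a", "b", "d", "e"}

sent_smart = smart_transfer(all_objects, receiver_has)
sent_dumb = dumb_transfer(sender_packs, all_objects, receiver_has)
```

Here sent_smart contains only the missing object "c", whereas sent_dumb contains the whole first pack ("a", "b" and "c"), which is the "more than really needed" case mentioned in the quote.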


The problem above was related to the use (or non-use) of git push --thin: when should you use it, and when not?
It turns out that you need to manage your binary objects carefully if you want git to take advantage of those thin packs:

  1. Create the new filename by simply copying the old file (so the old blob is reused)
  2. Commit
  3. Push
  4. Copy the real new file over it
  5. Commit
  6. Push

If you omit the intermediate push in step 3, neither "git push" nor "git push --thin" can realize that this new file can be "incrementally built" on the remote side (even though git-gc totally squashes it in the pack).

In fact, thin packs work by storing a delta against a base object which is not included in the pack.
Those objects which are not included but used as delta base are currently only the previous version of a file which is part of the update to be pushed/fetched.
In other words, there must be a previous version under the same name for this to work.
Doing otherwise wouldn't scale if the previous commit had thousands of files to test against.

Those thin packs were designed with different versions of the same file in mind, not different files with almost the same content. The issue is deciding which preferred delta bases to add to the list of objects. Currently only objects with the same path as those being modified are considered.
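The same-path rule, and why the intermediate push in the six-step workaround matters, can be sketched with a small Python model. This only models the behaviour as described above; it is not git's implementation. Trees are plain {path: blob id} dictionaries, and pick_delta_bases, the file names and the blob ids are hypothetical.

```python
def pick_delta_bases(remote_tree, new_tree):
    # For each added or changed path, the only candidate base is the blob the
    # remote already has under the SAME path (None means: send in full).
    bases = {}
    for path, blob in new_tree.items():
        if remote_tree.get(path) == blob:
            continue  # unchanged path, nothing to send
        bases[path] = remote_tree.get(path)
    return bases

# Direct approach: the new file appears under a new name in a single push.
remote = {"big.bin": "blob-old"}
direct = pick_delta_bases(remote, {"big.bin": "blob-old", "big-v2.bin": "blob-new"})

# Workaround: first push a copy of the old content under the new name...
after_first_push = {"big.bin": "blob-old", "big-v2.bin": "blob-old"}
# ...then push the real new content under that same name.
workaround = pick_delta_bases(
    after_first_push, {"big.bin": "blob-old", "big-v2.bin": "blob-new"}
)
```

In the direct case the new path has no base (direct maps "big-v2.bin" to None), so under this model the blob would be sent whole. After the intermediate push, the same path already exists on the remote with the old content, so workaround maps "big-v2.bin" to "blob-old" and a delta against it becomes possible.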
