How does git-rebase recognize "aliased" commits?


Problem description

I'm trying to better understand the magic behind git-rebase. I was very pleasantly surprised today by the following behavior, which I didn't expect.

TLDR: I rebased a shared branch, causing all commit sha1s to change. Despite this, a derived branch was able to accurately identify that its original commits were "aliased" into new commits with different sha1s. The rebase didn't create any mess at all.

Details

Take a master branch: M1

Branch it off into branch-X, with some additional commits added: M1-A1-B1-C1. Note down the git-log output.

Branch off branch-X into branch-Y, with one additional commit added: M1-A1-B1-C1-D1. Note down the git-log output.

Add a new commit to the tip of the master branch: M1-M2

Rebase branch-X onto the updated master: M1-M2-A2-B2-C2. Note that A2-B2-C2, all have the same message, contents and author-date as A1-B1-C1. However, they have completely different sha1 values, as well as commit dates. According to this writeup, the reason the SHA1 is different is because the commit's parent has changed.

Rebase branch-Y onto the updated branch-X. Result: M1-M2-A2-B2-C2-D2.

Notably only the D1 commit is applied (and becomes D2). The A1-B1-C1 commits in branch-Y are completely ignored by git-rebase. You can see this in the output logs.
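The steps above can be reproduced in a throwaway repository. This is only a sketch: the file names, the `commit` helper, and the demo identity are made up for illustration, and the branch names follow the ones used in this post.

```shell
#!/bin/sh
# Hypothetical reproduction of the scenario; nothing here touches an existing repo.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the post

# Illustrative helper: create one file and commit it under the given label.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1
git checkout -q -b branch-X
commit A; commit B; commit C              # M1-A1-B1-C1
git checkout -q -b branch-Y
commit D                                  # M1-A1-B1-C1-D1
git checkout -q master
commit M2                                 # M1-M2

git rebase -q master branch-X             # A1-B1-C1 are rewritten as A2-B2-C2
git rebase -q branch-X branch-Y           # only D is replayed on top of C2

git log --oneline branch-Y
replayed=$(git rev-list --count branch-X..branch-Y)
total=$(git rev-list --count branch-Y)
echo "commits on branch-Y but not on branch-X: $replayed (total: $total)"
```

If the behavior described above holds, branch-Y should end up exactly one commit ahead of branch-X.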

This is wonderful, but how does git-rebase know to ignore A1-B1-C1? How does git-rebase know that A2-B2-C2 are the same as A1-B1-C1, and hence, can be safely ignored? I had always assumed that git keeps track of commits using the sha1 identifier, but despite the above commits having different sha1s, git still somehow knows that they are linked together. How does it do that? Given the above behavior, when is it truly dangerous to rebase a shared branch?

Solution

Internally, git rebase lists the commits that should be rebased and computes a patch-id for each of them. Unlike the commit id, the patch-id hashes only the diff that the commit introduces, not the tree or the commit object (parent, dates, and so on). So A1 and A2, while having different commit ids, have the same patch-id, and git rebase skips any commit whose patch-id already appears among the upstream commits.
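One way to see this for yourself is to pipe a commit's diff through `git patch-id` before and after a rebase. The sketch below builds a tiny scratch repository (file names and the demo identity are assumptions for illustration) and compares both kinds of id for A1 and A2.

```shell
#!/bin/sh
# Compare commit ids and patch-ids for the same change before and after a rebase.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

echo base > base.txt; git add base.txt; git commit -q -m M1
git checkout -q -b branch-X
echo feature > a.txt; git add a.txt; git commit -q -m A
a1=$(git rev-parse HEAD)                  # commit id of A1 (before the rebase)

git checkout -q master
echo more > m2.txt; git add m2.txt; git commit -q -m M2
git rebase -q master branch-X
a2=$(git rev-parse HEAD)                  # commit id of A2 (after the rebase)

# git patch-id prints "<patch-id> <commit-id>"; keep only the first field.
pid1=$(git show "$a1" | git patch-id | cut -d' ' -f1)
pid2=$(git show "$a2" | git patch-id | cut -d' ' -f1)

echo "commit ids: $a1 vs $a2"
echo "patch-ids:  $pid1 vs $pid2"
```

The two commit ids differ because the parent changed, while the two patch-ids should match because the diff itself is unchanged.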

For more information, search patch-id here: https://git-scm.com/book/en/v2/Git-Branching-Rebasing


Relevant section from above (diagrams missing):

If someone on your team force pushes changes that overwrite work that you’ve based work on, your challenge is to figure out what is yours and what they’ve rewritten.

It turns out that in addition to the commit SHA-1 checksum, Git also calculates a checksum that is based just on the patch introduced with the commit. This is called a "patch-id".

If you pull down work that was rewritten and rebase it on top of the new commits from your partner, Git can often successfully figure out what is uniquely yours and apply them back on top of the new branch.

For instance, in the previous scenario, if instead of doing a merge when we’re at "Someone pushes rebased commits, abandoning commits you’ve based your work on" we run git rebase teamone/master, Git will:

  • Determine what work is unique to our branch (C2, C3, C4, C6, C7)
  • Determine which are not merge commits (C2, C3, C4)
  • Determine which have not been rewritten into the target branch (just C2 and C3, since C4 is the same patch as C4')
  • Apply those commits to the top of teamone/master

This only works if C4 and C4' that your partner made are almost exactly the same patch. Otherwise the rebase won’t be able to tell that it’s a duplicate and will add another C4-like patch (which will probably fail to apply cleanly, since the changes would already be at least somewhat there).
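The "already rewritten into the target branch" check can be previewed without rebasing by using `git cherry`, which marks commits that have an equivalent patch upstream with `-` and the rest with `+`. The sketch below applies it to the branch-X / branch-Y scenario from the question rather than the book's C2..C7 example; file names, the `commit` helper, and the demo identity are assumptions.

```shell
#!/bin/sh
# Preview which commits a rebase of branch-Y onto branch-X would treat as duplicates.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master

# Illustrative helper: create one file and commit it under the given label.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M1
git checkout -q -b branch-X
commit A; commit B; commit C
git checkout -q -b branch-Y
commit D
git checkout -q master
commit M2
git rebase -q master branch-X             # branch-Y still holds the old A1-B1-C1

# "-" means an equivalent patch already exists on branch-X, "+" means it does not.
git cherry -v branch-X branch-Y
skipped=$(git cherry branch-X branch-Y | grep -c '^-')
kept=$(git cherry branch-X branch-Y | grep -c '^+')
echo "equivalent upstream: $skipped, still to apply: $kept"
```

Commits marked `-` are the ones a later `git rebase branch-X branch-Y` is expected to skip. If a rewritten commit was also edited, so that its diff no longer matches, it would show up as `+` instead and be replayed.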
