Git history for branch after merge

Problem description

I am a bit confused on how git stores history after merging.

I have merged branch A to branch B successfully. Now, when I go to a file, in branch B, that was part of the merge I see all the history for that file for branch A but I don't see any history for branch B. Where has my history for that file for branch B gone to?

The way I merged was through git merge <branch> so in this case, I was in branch B and used git merge A.


For example, in branch A I had the following commits: a, aa, aaa corresponding to different files.

In branch B, I had the following commits: b, bb, bbb corresponding to different files.

Now when I merged branch A into branch B, all I see in branch B git log are a, aa, aaa history. I don't see my b history.

In essence, I want my merge to be linear, when I merge A to B then I want the history to have all of branch B history and on top of the history it will be the merge that just occurred similar to how SVN does it.

My current git log history is very confusing.

Solution

TL;DR

The history isn't gone, Git just isn't showing it.

In Git, the history is the set of commits. There is no file history!

When you run a command like git log dir/sub/file.ext, or for that matter, git log dir/sub or git log . while in dir/sub, Git will synthesize a (temporary) file history, by extracting some sub-history from the real history—the set of commits. This synthetic process deliberately drops some commits. For instance, it drops all commits that don't affect any of the files you have asked about. But by default, it drops a lot more than that, via something that git log calls History Simplification.

Longer

Every commit has a unique hash ID. You see these in git log output, for instance. The hash ID is actually just a cryptographic checksum of the commit's content.

Each commit stores (the hash ID of) a snapshot of files—Git calls this a tree. This is true of merge commits as well: a merge commit, like any other commit, has a tree.

Each commit also stores your name (author and committer) and email address and time-stamp, so that Git can show these to you. It stores a log message—whatever you give it—so that Git can show that as well.

The last thing that Git stores in a commit—the second thing, really, right after the tree—is a list of parent commits, by their unique hash IDs.
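To make the "checksum of the commit's content" idea concrete, here is a minimal Python sketch. It assumes the SHA-1 object format, in which the ID is a hash over a short header plus the object body. The helper names `git_object_id` and `commit_body`, and the sample identity string, are made up for illustration and are not part of Git.

```python
import hashlib


def git_object_id(kind: str, body: bytes) -> str:
    """Hash an object the way SHA-1 based repositories do:
    SHA-1 over "<kind> <size>", a NUL byte, then the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list, who: str, message: str) -> bytes:
    """Assemble a commit body: the tree first, then the parents, then metadata."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return ("\n".join(lines) + "\n").encode()


who = "A U Thor <author@example.com> 1700000000 +0000"  # hypothetical identity
tree = git_object_id("tree", b"")  # ID of the empty tree

# A root commit has no parents; its child records the root's hash ID.
root = git_object_id("commit", commit_body(tree, [], who, "initial"))
child = git_object_id("commit", commit_body(tree, [root], who, "second"))
print(root)
print(child)
```

Because the parent's hash ID is part of the child's content, changing anything in an earlier commit would change the ID of every commit after it.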

Linear history is easy

When dealing with ordinary, non-merge commits, it's pretty straightforward to look at the history. We simply start with the latest commit, as identified by some branch name like master, and work backwards. The branch name contains the hash ID of the last commit—the tip of the branch—and we say that the branch name points to that commit:

... <--1234567...   <--master

If commit 1234567 is the tip of master, git log can show you commit 1234567 ... and commit 1234567 has inside it the hash ID of the commit that comes right before 1234567.

If we swap out real hash IDs for single letters, to make things easier, we get something like this:

A <-B <-C <-D <-E <-F <-G   <--master

Commit G points back to commit F, which points back to E, and so on until we reach the very first commit, commit A. This commit does not point anywhere—it can't, it was the first commit; it cannot have a parent—so this is where the history ends (starts?), at the beginning of time. Git calls A a root commit: a commit with no parent.

It's easy to show linear history, starting at the end of time and ending at the start. Git just picks out each commit one at a time and shows it. That's what:

git log master

does: it starts with the one commit identified by master, and shows it, and then shows the one commit's one parent, and then shows the one before that, and so on.
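The backwards walk over a linear history can be sketched in a few lines of Python. This is only a toy model of the idea, using the single-letter commits from the drawing above; the `parent_of` and `branches` tables and the `log` function are invented for illustration.

```python
# Each commit records only its parent; None marks the root commit.
parent_of = {"G": "F", "F": "E", "E": "D", "D": "C", "C": "B", "B": "A", "A": None}
branches = {"master": "G"}  # a branch name points to the tip commit


def log(branch: str) -> list:
    """Walk backwards from the branch tip until the root commit is reached."""
    shown = []
    commit = branches[branch]
    while commit is not None:
        shown.append(commit)
        commit = parent_of[commit]
    return shown


print(log("master"))  # ['G', 'F', 'E', 'D', 'C', 'B', 'A']
```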

When you have Git show you a commit, you can—in fact, you almost always—have Git show it as a patch, rather than as a snapshot. For instance, git log --patch does this. To show a commit as a patch, Git just looks at the commit's parent's tree first, then at the commit's tree, and compares the two. Since both are snapshots, whatever changed from the parent's snapshot to the child's, must be whatever the person who made the child commit actually did.
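As a rough sketch of how a patch falls out of two snapshots, the following Python compares a parent snapshot with a child snapshot. Snapshots are modeled here as plain dictionaries from path to content, which is an assumption made for brevity; real Git trees are nested objects and a real patch also shows line-level changes.

```python
def diff_snapshots(parent: dict, child: dict) -> dict:
    """Compare two snapshots (path -> content) and report what changed."""
    changes = {}
    for path in sorted(set(parent) | set(child)):
        if path not in parent:
            changes[path] = "added"
        elif path not in child:
            changes[path] = "deleted"
        elif parent[path] != child[path]:
            changes[path] = "modified"
    return changes


parent_tree = {"README": "hello\n", "main.c": "int main() {}\n"}
child_tree = {"README": "hello, world\n", "main.c": "int main() {}\n", "Makefile": "all:\n"}
print(diff_snapshots(parent_tree, child_tree))
# {'Makefile': 'added', 'README': 'modified'}
```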

Non-linear history is harder

Now that we know that Git works backwards, let's take a look at more complex history, including history that includes an actual merge commit. (Let's not get sidetracked by the fact that git merge does not always merge!)

A merge commit is simply a commit with at least two parents. In most cases you won't see commits with three or more parents—Git calls these octopus merges, and they don't do anything you cannot do with ordinary merges, so octopus merges are mainly for showing off your Git-fu. :-)

We normally get a merge by doing git checkout somebranch; git merge otherbranch, and we can draw the resulting commit chain like this:

...--E--F--G------M   <-- master
         \       /
          H--I--J   <-- feature

Now, suppose you run git log master (note: no --patch option). Git should of course show you commit M first. But which commit will Git show next? J, or G? If it shows one of those, which one should it show after that?

Git has a general answer to this problem: when it shows you a merge commit, it can add both parents of the commit to a queue of "commits yet to be shown". When it shows you an ordinary non-merge commit, it adds the (single) parent to the same queue. It can then loop through the queue, showing you commits one at a time, adding their parents to the queue.

When the history is linear, the queue has one commit in it at a time: the one commit gets removed and shown, and the queue now has the one parent in it and you see the parent.

When the history has a merge, the queue starts with one commit, Git pops the commit off the queue and shows it, and puts both parents in the queue. Then Git picks one of the two parents and shows you G or J, and puts F or I into the queue. The queue still has two commits in it. Git pops one off and shows that commit and puts another one on.

Eventually Git tries to put F on the queue when F is already on the queue. Git avoids adding it twice, so eventually the queue depth reduces to one commit again, in this case showing F, E, D, and so on. (The details here are a bit complicated: the queue is specifically a priority queue with the priority being determined by additional git log sorting parameters, so there are different ways that this can happen.)
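The queue-based walk can be modeled with a small Python sketch. It assumes the graph drawn above with H branching off from F, treats E as a root for brevity, and uses made-up commit dates as the priority (newest first). It illustrates the idea only and is not how Git itself is implemented.

```python
import heapq

# Parents of each commit; M is the merge with two parents.
parents = {"M": ["G", "J"], "G": ["F"], "J": ["I"], "I": ["H"],
           "H": ["F"], "F": ["E"], "E": []}
# Hypothetical commit dates; larger means newer.
date = {"E": 1, "F": 2, "H": 3, "G": 4, "I": 5, "J": 6, "M": 7}


def log(tip: str) -> list:
    """Show commits newest-first, never queueing the same commit twice."""
    queue = [(-date[tip], tip)]  # heapq is a min-heap, so negate the date
    seen = {tip}
    shown = []
    while queue:
        _, commit = heapq.heappop(queue)
        shown.append(commit)
        for parent in parents[commit]:
            if parent not in seen:  # F is reachable from both G and H
                seen.add(parent)
                heapq.heappush(queue, (-date[parent], parent))
    return shown


print(log("M"))  # ['M', 'J', 'I', 'G', 'H', 'F', 'E']
```

With different dates the two sides of the merge interleave differently, which is why the order of git log output around a merge can look surprising.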

You can view connections with git log --graph

If you add --graph to your git log command, Git will draw a somewhat crude ASCII-art graph with lines connecting child commits back to their parents. This is very helpful in telling you that the commit history you are viewing is not linear after all, even though git log is showing you one commit at a time (because it must).

Showing merge commits

I mentioned above that with -p or --patch, git log will show what changed in a commit by comparing the parent's snapshot/tree against the child's snapshot/tree. But for a merge commit, there are two (or even more) parents: there's no way to show you the comparison of the parent vs the child, because there are at least two parents.

What git log does, by default, is to give up entirely. It simply doesn't show a patch. Other commands do something more complicated, and you can convince git log to do that too, but let's just note that the default is for git log to give up here.

History Simplification (a section of the git log documentation)

When you run git log file.ext, Git will deliberately skip any non-merge commit where the diff (as obtained by comparing parent to child) does not touch file.ext. That's natural enough: if you have a chain like:

A--B--C--D--E   <-- master

and you changed (or first created) file.ext when you made commits A and E, you'd like to see just those two commits. Git can do this by figuring out a patch for D-vs-E and seeing that file.ext changed (so it should show E), then moving on to D. The C-vs-D comparison shows no change to file.ext, so Git won't show D, but it will put C in the priority queue and go on to visit C. That, too, has no change to the file, so Git eventually moves on to B, which has no change, and Git moves to A. For comparison purposes, all files in A are always new—that's the rule for any root commit; all files are added—so Git shows you A as well.
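The filtering of a linear chain can be sketched as follows. The snapshot contents are invented so that only A and E change file.ext, and the root commit is compared against an empty snapshot, matching the rule that every file in a root commit counts as added.

```python
# A linear chain A--B--C--D--E; only A and E change file.ext.
chain = ["A", "B", "C", "D", "E"]
snapshot = {
    "A": {"file.ext": "v1"},
    "B": {"file.ext": "v1", "other": "x"},
    "C": {"file.ext": "v1", "other": "y"},
    "D": {"file.ext": "v1", "other": "z"},
    "E": {"file.ext": "v2", "other": "z"},
}


def log_path(path: str) -> list:
    """Walk from the tip backwards, keeping commits whose diff touches path."""
    shown = []
    for i in range(len(chain) - 1, -1, -1):
        commit = chain[i]
        # The root commit has no parent, so compare it with an empty snapshot.
        before = snapshot[chain[i - 1]] if i > 0 else {}
        if before.get(path) != snapshot[commit].get(path):
            shown.append(commit)
    return shown


print(log_path("file.ext"))  # ['E', 'A']
```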

We just saw, though, that by default git log doesn't like to compute patches for a merge. It's too hard! So git log generally won't show you the merge here. It does, however, try to simplify away any part of the commit graph that it can. As the documentation puts it, the default mode:

prunes some side branches if the end result is the same ...

If the commit was a merge, and [the file is the same as in] one parent, follow only that parent. ... Otherwise, follow all parents.

So at a merge commit like M in our graph, Git will do a fast check: is file.ext the same in M as in G? If so, add G to the queue. If not, is it the same in M as in J? If so, add J to the queue. Otherwise—i.e., file.ext is different in M than in both G and J—add both G and J to the queue.
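The check described above can be written as a short Python function. This is a sketch of the rule as stated in this answer, with snapshots modeled as dictionaries; the function name `parents_to_follow` and the sample contents are hypothetical.

```python
def parents_to_follow(merge_tree: dict, parent_trees: dict, path: str) -> list:
    """Pick which parents a path-limited walk visits after a merge.

    parent_trees maps parent name -> snapshot, in parent order.
    """
    for name, tree in parent_trees.items():
        if tree.get(path) == merge_tree.get(path):
            return [name]  # file matches this parent: follow only this one
    return list(parent_trees)  # file differs from every parent: follow all


g = {"file.ext": "branch B version"}
j = {"file.ext": "branch A version"}
took_j = {"file.ext": "branch A version"}
combined = {"file.ext": "combined"}

# The merge kept J's version of the file, so only J is followed.
print(parents_to_follow(took_j, {"G": g, "J": j}, "file.ext"))  # ['J']

# The merge result differs from both parents, so both are followed.
print(parents_to_follow(combined, {"G": g, "J": j}, "file.ext"))  # ['G', 'J']
```

In the first case the commits on G's side that touched file.ext are never visited, which is how one branch's history of a file can seem to vanish after a merge.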

There are other modes for History Simplification, which you can select with various flags. This answer is already too long so I will leave them to the documentation (see the History Simplification section of the git log documentation).

Conclusion

You cannot draw too many inferences from what git log -- path shows you, because of the history simplification that Git performs. If you want to see everything, consider running git log --full-history -m -p -- path instead. The -m option makes Git diff each merge against each of its parents separately (this goes with the -p option), and --full-history forces Git to follow all parents at all times.
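To tie this back to the question, here is a toy Python model of one plausible way the situation can arise: branch A is merged into branch B, and the merge result for file.ext is identical to A's version. The commit names, file contents, and the simplified first-in-first-out ordering are all assumptions made for illustration; real Git orders commits differently and has more simplification modes.

```python
from collections import deque

# Content of file.ext in each commit (made-up values for illustration).
content = {"O": "v0", "b": "b1", "bb": "b2", "a": "a1", "aa": "a2", "M": "a2"}
# M is the merge made on branch B; its parents are bb (B's tip) and aa (A's tip).
parents = {"O": [], "b": ["O"], "bb": ["b"], "a": ["O"], "aa": ["a"], "M": ["bb", "aa"]}


def log_file(tip: str, full_history: bool = False) -> list:
    """Path-limited walk: default simplification, or following all parents."""
    shown, seen, queue = [], {tip}, deque([tip])
    while queue:
        commit = queue.popleft()
        ps = parents[commit]
        same = [p for p in ps if content[p] == content[commit]]
        if full_history:
            show = not ps or len(same) < len(ps)  # root, or differs from some parent
            follow = ps
        else:
            show = not same  # root, or differs from every parent
            follow = same[:1] or ps
        if show:
            shown.append(commit)
        for p in follow:
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return shown


print(log_file("M"))                     # ['aa', 'a', 'O']
print(log_file("M", full_history=True))  # ['M', 'bb', 'aa', 'b', 'a', 'O']
```

In the default mode only the commits from A's side appear, as in the question. Following all parents brings b and bb back, which shows that they were never removed from the history.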
