How does Git track history during a refactoring?

Problem description

I understand how Git supports file moves: because it uses file hashes, an "added" file is easily detected as being the same as the "removed" one.

My question is about refactoring. In Java, the package declaration changes when a file moves, so the file content will NOT be the same. In that case, how does Git determine that the "added" file shares history with the "removed" one? Does it check for the "most similar content", assuming I only made minor changes, or does it use some similar non-deterministic solution?

Solution

As mentioned in the Git FAQ, Git detects similar content using a heuristic. The FAQ explains:

Git has to interoperate with a lot of different workflows, for example some changes can come from patches, where rename information may not be available. Relying on explicit rename tracking makes it impossible to merge two trees that have done exactly the same thing, except one did it as a patch (create/delete) and one did it using some other heuristic.

On a second note, tracking renames is really just a special case of tracking how content moves in the tree. In some cases, you may instead be interested in querying when a function was added or moved to a different file. By only relying on the ability to recreate this information when needed, Git aims to provide a more flexible way to track how your tree is changing.

However, this does not mean that Git has no support for renames.
The diff machinery in Git has support for automatically detecting renames, this is turned on by the '-M' switch to the git-diff-* family of commands.
The rename detection machinery is used by git-log(1) and git-whatchanged(1), so for example, 'git log -M' will give the commit history with rename information.
Git also supports a limited form of merging across renames.
The two tools for assigning blame, git-blame(1) and git-annotate(1) both use the automatic rename detection code to track renames.
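
To make the idea of content-based detection concrete, here is a minimal Python sketch. It is not Git's actual algorithm: it assumes a simple line-by-line comparison with the standard difflib module, and the Java source and the similarity helper are hypothetical. It only illustrates why a file whose package declaration changed still scores as "mostly the same content".

```python
import difflib


def similarity(old_text: str, new_text: str) -> float:
    """Return a 0..1 score for how much content two texts share, line by line.

    Illustrative stand-in only; Git computes its similarity index differently.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    return matcher.ratio()


# Hypothetical Java file before and after moving it to another package.
OLD = """package com.example.old;

public class Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}
"""
NEW = OLD.replace("package com.example.old;", "package com.example.refactored;")

# Only the package line differs, so the score stays high.
score = similarity(OLD, NEW)
print(f"similarity: {score:.0%}")
```

Because the score is computed from content alone, the same result is obtained whether the move was made with git mv, by a patch, or by an IDE refactoring.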


The git log documentation gives some details about that heuristic:

-B[<n>][/<m>]

Break complete rewrite changes into pairs of delete and create. This serves two purposes:

  • It affects the way a change that amounts to a total rewrite of a file not as a series of deletion and insertion mixed together with a very few lines that happen to match textually as the context, but as a single deletion of everything old followed by a single insertion of everything new, and the number m controls this aspect of the -B option (defaults to 60%).
    -B/70% specifies that less than 30% of the original should remain in the result for git to consider it a total rewrite (i.e. otherwise the resulting patch will be a series of deletion and insertion mixed together with context lines).

  • When used with -M, a totally-rewritten file is also considered as the source of a rename (usually -M only considers a file that disappeared as the source of a rename), and the number n controls this aspect of the -B option (defaults to 50%).
    -B20% specifies that a change with addition and deletion compared to 20% or more of the file's size are eligible for being picked up as a possible source of a rename to another file.
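
The "total rewrite" threshold described in the first bullet can be sketched roughly as follows. This is an assumption-laden illustration, not Git's implementation: it counts surviving lines with difflib, and the helper names remaining_fraction and is_total_rewrite are hypothetical.

```python
import difflib


def remaining_fraction(old_text: str, new_text: str) -> float:
    """Fraction of the original lines that survive unchanged in the new text."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    if not old_lines:
        return 0.0
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(old_lines)


def is_total_rewrite(old_text: str, new_text: str, m: float = 0.6) -> bool:
    """True when less than (1 - m) of the original remains, in the spirit of -B/<m>."""
    return remaining_fraction(old_text, new_text) < 1.0 - m


original = "\n".join(f"line {i}" for i in range(10))
# Keeps only the first two lines of the original.
rewritten = "\n".join(["line 0", "line 1"] + [f"new {i}" for i in range(2, 10)])
# Changes a single line of the original.
light_edit = "\n".join(
    "changed 5" if i == 5 else f"line {i}" for i in range(10)
)

print(is_total_rewrite(original, rewritten))         # 20% remains, below 40%
print(is_total_rewrite(original, rewritten, m=0.7))  # 20% remains, below 30%
print(is_total_rewrite(original, light_edit))        # 90% remains
```

With m=0.7, as in -B/70%, the change counts as a rewrite only when less than 30% of the original survives.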

-M[<n>]

If generating diffs, detect and report renames for each commit. For following files across renames while traversing history, see --follow.
If n is specified, it is a threshold on the similarity index (i.e. amount of addition/deletions compared to the file's size).
For example, -M90% means git should consider a delete/add pair to be a rename if more than 90% of the file hasn't changed.
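
The pairing of delete/add entries against a threshold can be sketched as below. Again, this is only an illustration under stated assumptions: the similarity score comes from difflib rather than Git's own index, the paths and file contents are made up, and detect_renames is a hypothetical helper.

```python
import difflib


def similarity(old_text: str, new_text: str) -> float:
    """Line-based 0..1 similarity score (a stand-in for Git's similarity index)."""
    matcher = difflib.SequenceMatcher(
        None, old_text.splitlines(), new_text.splitlines(), autojunk=False
    )
    return matcher.ratio()


def detect_renames(deleted: dict, added: dict, threshold: float = 0.9) -> dict:
    """Pair each deleted path with its most similar added path.

    A pair is reported as a rename only if its score is at or above the threshold.
    """
    renames = {}
    unclaimed = dict(added)
    for old_path, old_text in deleted.items():
        best_path, best_score = None, 0.0
        for new_path, new_text in unclaimed.items():
            score = similarity(old_text, new_text)
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames[old_path] = (best_path, best_score)
            del unclaimed[best_path]  # each added file is used at most once
    return renames


# Hypothetical refactoring: Greeter moves to a new package, Other is brand new.
body = "\n".join(f"    private int field{i};" for i in range(18))
old_source = f"package com.example.old;\npublic class Greeter {{\n{body}\n}}\n"
new_source = old_source.replace("com.example.old", "com.example.refactored")

deleted = {"src/old/Greeter.java": old_source}
added = {
    "src/refactored/Greeter.java": new_source,
    "src/refactored/Other.java": "package com.example.refactored;\npublic class Other {\n}\n",
}

renames = detect_renames(deleted, added, threshold=0.9)
for old_path, (new_path, score) in renames.items():
    print(f"{old_path} -> {new_path} ({score:.0%} similar)")
```

Raising the threshold makes detection stricter: in this sketch, a threshold of 0.99 would report the same change as an unrelated delete and add.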


Note: With Git 2.18 (Q2 2018), git status should now show you renames (instead of delete/add files) when you move/rename files.
See "How to tell Git that it's the same directory, just a different name".
