How does Git track history during a refactoring?


Problem description

I understand well how Git can support file moves: because it uses file hashes, an "added" file is easily detected as being the same as the "removed" one.

My question is about refactoring: in Java, for example, the package declaration changes, so the file content will NOT be the same. In that case, how does Git determine that the "added" file shares history with the "removed" one? Does it check for the "most similar content", assuming I only made minor changes, or use some similar non-deterministic solution?

Solution

As mentioned in the Git FAQ, it detects similar content based on a heuristic:

Git has to interoperate with a lot of different workflows, for example some changes can come from patches, where rename information may not be available. Relying on explicit rename tracking makes it impossible to merge two trees that have done exactly the same thing, except one did it as a patch (create/delete) and one did it using some other heuristic.

On a second note, tracking renames is really just a special case of tracking how content moves in the tree. In some cases, you may instead be interested in querying when a function was added or moved to a different file. By only relying on the ability to recreate this information when needed, Git aims to provide a more flexible way to track how your tree is changing.

However, this does not mean that Git has no support for renames.
The diff machinery in Git has support for automatically detecting renames; this is turned on by the '-M' switch to the git-diff-* family of commands.
The rename detection machinery is used by git-log(1) and git-whatchanged(1), so for example, 'git log -M' will give the commit history with rename information.
Git also supports a limited form of merging across renames.
The two tools for assigning blame, git-blame(1) and git-annotate(1) both use the automatic rename detection code to track renames.
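To see what the detected renames look like in practice, one option is to read the output of `git diff --name-status -M`, where a rename entry is printed as `R` followed by the similarity score and the old and new paths, separated by tabs. The sketch below only parses such output; the function name `parse_renames` and the sample text are illustrative assumptions, not real output from any repository.

```python
def parse_renames(name_status_output: str):
    """Extract (old_path, new_path, score) tuples from `git diff --name-status -M` text."""
    renames = []
    for line in name_status_output.splitlines():
        fields = line.split("\t")
        # Rename entries have three tab-separated fields: R<score>, old path, new path.
        if len(fields) == 3 and fields[0].startswith("R"):
            status, old_path, new_path = fields
            score = int(status[1:]) if status[1:].isdigit() else None
            renames.append((old_path, new_path, score))
    return renames


# Hand-written sample (hypothetical paths), shaped like name-status output.
sample = (
    "M\tREADME.md\n"
    "R096\tsrc/com/example/old/Foo.java\tsrc/com/example/util/Foo.java\n"
    "A\tsrc/com/example/util/Bar.java\n"
)

for old_path, new_path, score in parse_renames(sample):
    print(f"{old_path} -> {new_path} ({score}% similar)")
```

The score is the similarity index Git computed for the pair, so a refactoring that only touches the package declaration would typically show up with a high value.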


The git log documentation gives some details about that heuristic:

-B[<n>][/<m>]

Break complete rewrite changes into pairs of delete and create. This serves two purposes:

  • It changes how a total rewrite of a file is shown: not as a series of deletions and insertions mixed together with a very few lines that happen to match textually as context, but as a single deletion of everything old followed by a single insertion of everything new. The number m controls this aspect of the -B option (defaults to 60%).
    -B/70% specifies that less than 30% of the original should remain in the result for git to consider it a total rewrite (i.e. otherwise the resulting patch will be a series of deletion and insertion mixed together with context lines).

  • When used with -M, a totally-rewritten file is also considered as the source of a rename (usually -M only considers a file that disappeared as the source of a rename), and the number n controls this aspect of the -B option (defaults to 50%).
    -B20% specifies that a change with addition and deletion amounting to 20% or more of the file's size is eligible for being picked up as a possible source of a rename to another file.
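The first purpose of -B can be read as a simple threshold test: if too little of the original survives, treat the change as a total rewrite. The following is a minimal sketch of that reading only. It uses Python's `difflib` on whole lines as a stand-in, which is an assumption; Git's own break detection works differently internally, and the helper names here are hypothetical.

```python
from difflib import SequenceMatcher


def surviving_fraction(old_text: str, new_text: str) -> float:
    """Fraction of the original lines that still appear, in order, in the result."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    if not old_lines:
        return 1.0  # nothing to rewrite in an empty original
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / len(old_lines)


def is_total_rewrite(old_text: str, new_text: str, m: float = 0.60) -> bool:
    """Documented reading of -B/<m>: a rewrite if less than (1 - m) of the original remains."""
    return surviving_fraction(old_text, new_text) < (1.0 - m)


original = "\n".join(f"line {i}" for i in range(10))
# Keeps only the first two original lines and replaces the rest.
rewritten = "\n".join(["line 0", "line 1"] + [f"fresh {i}" for i in range(8)])
# Changes a single line.
edited = original.replace("line 5", "line five")

print(is_total_rewrite(original, rewritten))  # little of the original survives
print(is_total_rewrite(original, edited))     # most of the original survives
```

With the default of 60%, the rewrite test fires when under 40% of the original remains; passing `m=0.70` mirrors the -B/70% example above.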

-M[<n>]

If generating diffs, detect and report renames for each commit. For following files across renames while traversing history, see --follow.
If n is specified, it is a threshold on the similarity index (i.e. amount of addition/deletions compared to the file's size).
For example, -M90% means git should consider a delete/add pair to be a rename if more than 90% of the file hasn't changed.
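The Java case from the question can be illustrated with a small sketch of threshold-based rename detection: pair each removed file with the most similar added file and accept the pair if the similarity clears the threshold. This is not Git's implementation; the line-based `difflib` ratio stands in for Git's similarity index, and the function names and file paths are hypothetical. The 0.5 default mirrors Git's default -M threshold of 50%.

```python
from difflib import SequenceMatcher


def similarity(old_text: str, new_text: str) -> float:
    """Rough line-based similarity in [0, 1]; a stand-in for Git's similarity index."""
    return SequenceMatcher(
        None, old_text.splitlines(), new_text.splitlines(), autojunk=False
    ).ratio()


def detect_renames(deleted: dict, added: dict, threshold: float = 0.5):
    """Pair each deleted path with its most similar added path, if similar enough."""
    renames = []
    unclaimed = dict(added)
    for old_path, old_text in deleted.items():
        best_path, best_score = None, 0.0
        for new_path, new_text in unclaimed.items():
            score = similarity(old_text, new_text)
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            renames.append((old_path, best_path, best_score))
            del unclaimed[best_path]  # each added file is used at most once
    return renames


# Hypothetical Java class moved to another package: only the package line differs.
body = "\n".join(f"    int field{i};" for i in range(8))
old_src = f"package com.example.old;\npublic class Foo {{\n{body}\n}}\n"
new_src = f"package com.example.util;\npublic class Foo {{\n{body}\n}}\n"

renames = detect_renames(
    {"src/com/example/old/Foo.java": old_src},
    {"src/com/example/util/Foo.java": new_src, "README.txt": "unrelated notes\n"},
)
for old_path, new_path, score in renames:
    print(f"{old_path} -> {new_path} ({score:.0%})")
```

Here 10 of the 11 lines are unchanged, so the pair clears both the default threshold and a stricter one such as 0.9. A class that was heavily edited while being moved would score lower and might be reported as a separate delete and add instead.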


