Need to get all file differences (added, modified, renamed) between two Git commits


Question

I'm trying to export all files with differences between two commits, those differences being:

  • New files (added)
  • Modified files
  • Renamed files
  • If possible, information on any deleted files

Detecting renames may be a tough one, as I will be doing the export in a Windows 7 environment, where somefile.php is the same as SomeFile.php; but I will be uploading the files to a *nix environment, which does treat them as different files, so such renames need to be recognized and exported if possible.

I'm using the following command:

git diff-tree -r --no-commit-id --name-only --diff-filter=ACMRT $head_commit_id $older_commit_id | xargs tar -cf project.tar -T -

However I noticed it was not exporting new/added files and also was not exporting renamed files; I then found out that git diff-tree doesn't do rename detection by default, so from what I can see I would need to add --find-renames to the command?

Recommended answer

As in CodeWizard's answer, you can use the "user-friendly" (or porcelain) command git diff instead of git diff-tree, which is what Git calls a plumbing command, meant for use in scripts. You should, however, be aware of what this means.

Since porcelain commands are meant for humans, they try to present things in human-readable fashion. This means they obey any setting that the one human in particular has set for himself/herself, in the various configuration files. That includes the diff.renames and diff.renameLimit configurations. They may also modify their output to make it easier for eyeballs, yet harder for computer programs, to deal with. Worst, they may change their output from one Git version to another, if people seem to prefer some default.

Plumbing commands are not meant for any of the above. They are meant for scripts, so they behave in predictable ways, with output that neither changes between versions nor depends on configuration items. That way, you get exactly what you request: reliable output in a reliable form, so that if you write your own reliable code, it will not just work today, for one case; it will keep working in the future, for all cases where it can. [1]

In the end, what this means is that if you use git diff-tree and set the right flags, you will get more reliable output. If you use git diff, your rename detection depends on:

  • the user's configuration, and
  • the version of Git: rename detection in porcelain commands defaults to on in 2.9.0 and above, but off in earlier versions of Git.
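One way to take both of those variables out of play is to state the rename behavior explicitly instead of inheriting it. A minimal sketch, assuming the two commit IDs are passed as arguments, older one first; the function names are made up for illustration:

```shell
# Plumbing: rename detection is off unless requested; --no-renames just makes that explicit.
# $1 = older commit, $2 = newer commit
list_changes_plumbing() {
    git diff-tree -r --no-commit-id --no-renames --name-status "$1" "$2"
}

# Porcelain: override the user's diff.renames setting for this one invocation.
# $1 = older commit, $2 = newer commit
list_changes_porcelain() {
    git -c diff.renames=false diff --name-status "$1" "$2"
}
```

With rename detection pinned to off, both forms should report a renamed file as one deleted path plus one added path, whatever the Git version or the user's configuration.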

As you discovered, the output from rename-detection is two pathnames, which is not something you can just pipe to an archiver. Archivers in general have issues with file deletion (this is, perhaps, one classic difference between archives and backups / snapshots; note that both of these are related to version control as well).
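To make the two-pathname point concrete: with rename detection on, a `--name-status` line for a rename has three tab-separated fields (status letter plus similarity score, old path, new path), whereas added, modified, and deleted entries have two. If you did want rename detection and still needed a plain list for an archiver, something like the following could reduce the output to one path per line. This is only a sketch; the function name is hypothetical, and it assumes no path contains tabs, newlines, or characters that Git would quote.

```shell
# Reads `--name-status` lines on stdin and prints, one per line, each path as it
# exists in the newer commit. Deleted entries are skipped.
paths_in_newer_commit() {
    awk -F '\t' '
        $1 ~ /^[RC]/ { print $3; next }   # rename/copy: status+score, old path, new path
        $1 == "D"    { next }             # a deletion leaves nothing to archive
                     { print $2 }         # A, M, T: a single path
    '
}
```

It would be used as a filter, for example `git diff-tree -r --no-commit-id --find-renames --name-status OLD NEW | paths_in_newer_commit`, where OLD and NEW stand for your two commit IDs.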

If your goal is a sort of union of all files—i.e., if the diff says that a file named A was added, one named D was deleted, and file R was created by renaming the old name O (and perhaps also modifying it: note Git's similarity index number that comes after the letter R), then you wish to collect file A, ignore file D, and collect file R while ignoring file O—well, then, what you want is to not detect renames in the first place! If you do not detect renames—which git diff-tree does not by default—this same diff will be presented as: add file A, delete file D, delete file O, and add file R. So a git diff-tree with a diff-filter that includes AM and excludes D suffices. It is less clear what to do with T, which is for a type-change: from ordinary file to symbolic link, for instance, or from file to sub-repository commit hash (what Git calls a gitlink entry, for a submodule).
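Put together, the "union" export described above might look like the sketch below. It makes several assumptions that are not in the question: the newer commit is the one currently checked out (tar reads the working tree, not the repository), your tar accepts `--null` together with `-T` (GNU tar and bsdtar do), and type changes (`T`) are deliberately left out of the filter until you decide how to handle them. The function name is hypothetical. Note also the argument order: `git diff-tree` reports changes from its first commit to its second, so the older commit goes first; with the newer commit first, files added since then show up as deletions and are filtered away.

```shell
# $1 = older commit, $2 = newer commit (assumed to be checked out), $3 = output tar file
export_changed_files() {
    # -z emits NUL-terminated names, so paths with spaces survive the pipe.
    git diff-tree -r --no-commit-id --no-renames --name-only -z \
        --diff-filter=AM "$1" "$2" |
        tar --null -T - -cf "$3"
}
```

Called as `export_changed_files "$older_commit_id" "$head_commit_id" project.tar`, this should collect added and modified files, plus the new name of every renamed file, while skipping deleted files and old names.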

Similarly, you don't want to enable copy detection: a C status, like R, presents a similarity index and a pair of pathnames. If you leave it disabled, you simply get the new pathname as an Added file.

Even if you do all this, you are still set up for a pitfall. Suppose that commit hash C1 has a file named problem, and a (presumably later) commit hash C2 has instead two files named problem/A and problem/B. This implies that the original file problem was deleted between these two points, because most systems (including Git itself) forbid having both a file named problem and a directory named problem holding various files. Given that each tar-archive itself is not a complete snapshot—you omit files that are unmodified between C1 and C2—your procedure for extracting these snapshots must necessarily be additive: extract earlier snapshot, then extract later snapshot atop earlier snapshot. This process will fail at the point where file problem is in the way of creating directory problem. Obviously, you can check for such problems and remove the problematic file (you can see now why I named the file problem :-) ), but more generally, since you are not storing "delete" directives in the first place, you won't know, in a future case where you are using these archives to rebuild a snapshot, that some files don't belong in that snapshot at all.
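Checking for this particular pitfall can be automated. The sketch below (hypothetical function name) reads `--name-status` output produced without rename detection and reports every deleted path that has since become a directory, that is, a deleted path that is a directory prefix of some added path. It handles only this direction (file replaced by directory), not the reverse, and it assumes paths contain no tabs or newlines.

```shell
# Reads `--name-status` lines (generated with --no-renames) on stdin and prints
# each deleted path that is now a directory holding at least one added file.
file_dir_conflicts() {
    awk -F '\t' '
        $1 == "D" { deleted[$2] = 1 }
        $1 == "A" { added[$2] = 1 }
        END {
            for (d in deleted)
                for (a in added)
                    if (index(a, d "/") == 1) { print d; break }
        }
    '
}
```

For the example above, feeding it the lines for `D problem`, `A problem/A`, and `A problem/B` should print `problem`, which is the file to remove on the target before extracting the later archive.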

(The classic solution to this problem is to prefix update-archives with some kind of manifest or directive. If you decide to use such a solution, then, depending on the kind of detail you want in the manifest-or-directive, you might want to do a first pass to detect exact renames and/or exact copies.)
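A very small version of such a manifest is a plain list of deleted paths shipped next to the archive and applied on the target before extraction. The sketch below is one possible shape, not an established format: the function names and the one-path-per-line layout are assumptions, and it does not cope with paths that Git would quote (tabs, newlines, and the like).

```shell
# Writes the paths deleted between two commits, one per line, to a manifest file.
# $1 = older commit, $2 = newer commit, $3 = manifest file
write_delete_manifest() {
    git diff-tree -r --no-commit-id --no-renames --name-only \
        --diff-filter=D "$1" "$2" > "$3"
}

# On the target: removes every listed path below the deployment root.
# $1 = manifest file, $2 = deployment root
apply_delete_manifest() {
    while IFS= read -r path; do
        if [ -n "$path" ]; then
            rm -f -- "$2/$path"
        fi
    done < "$1"
}
```

Running `apply_delete_manifest` before extracting the tar file also clears the file-versus-directory case, because the stale file is gone by the time the directory is created.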

[1] Obviously, newly added features can present problems for everyone, not just scripts and not just humans, but the Git folks do work hard on not creating unnecessary problems for scripts that rely on plumbing commands. Consider, for instance, the new impetus to push Git toward using some flavor of SHA-256 instead of, or in addition to, SHA-1. Since SHA-1 produces 160-bit hashes and SHA-256 produces 256-bit hashes, these must be represented as 40 and 64 hexadecimal digits respectively. Linus suggested abbreviating 256-bit hashes to 40 characters by default, to help out existing scripts that assume 40 characters, but I foresee some problems... :-)
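For a script of your own, the defensive move is to avoid hard-coding 40 in the first place. A small sketch of a check that accepts either full-length form (the function name is made up, and it only validates the shape of the string, not whether the object exists):

```shell
# Succeeds if $1 looks like a full object ID: lowercase hex, 40 or 64 digits long.
is_full_object_id() {
    case "$1" in
        *[!0-9a-f]*) return 1 ;;   # contains a character that is not lowercase hex
    esac
    [ "${#1}" -eq 40 ] || [ "${#1}" -eq 64 ]
}
```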
