How to read the index line in diff --git output


Problem description

I have a patch that looks like this:

diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
 15 index e220f68..e611b24 100644
 16 --- a/tools/python/xen/lowlevel/xc/xc.c
 17 +++ b/tools/python/xen/lowlevel/xc/xc.c
 18 @@ -228,6 +228,7 @@ static PyObject *pyxc_vcpu_setaffinity(XcObject *self,
 19      int vcpu = 0, i;
 20      xc_cpumap_t cpumap;
 21      PyObject *cpulist = NULL;

And I want to know which commit generates the patch, and how to parse 15 index e220f68..e611b24 100644 in the patch?

Solution

Let's take a look at output from git show. (This is actual output from a real repo, although I'll snip most bits.)

$ git show d362e62
commit d362e62490dd7f59c170a0a050a203fa0eda9f5a
[snip]
diff --git a/fmt.py b/fmt.py
index c44c267..ba772ee 100755
[snip]

Here, d362e62 is the "short version" of the true name of the commit, i.e., its SHA-1. The "long" form is the full 40-character version, which is the first line of git show output.

Besides the commit text, the commit itself contains a "tree" (and zero or more "parents"). We can see this with git cat-file -p:

$ git cat-file -p d362e62
tree 0b9bebfee8890b242875af0df209fd9f335bf14d
parent 41f3a6bcba1f5f7059133f862727809f49ff4657
[snip author, committer, and commit text]

We can look at the "tree" as well. I could use the "true name" SHA-1 above, but here I use a bit of git syntax: a commit identifier followed by ^{tree} tells git to extract the tree ID from the commit ID.

$ git cat-file -p d362e62^{tree}
[snip]
120000 blob 7417b50d02819bbebeac0f4104850549935f7089    fmt
100755 blob ba772eeb6139de5a724d67d18ce01bfccaf57590    fmt.py
[snip]

I left in the line for fmt as it is a symlink to fmt.py. The symlink has mode 120000, which tells git that the blob data is actually the target of the symlink. The file, fmt.py, has mode 100755, which tells git that it's an ordinary file and that it is executable (it's a Python script). This is the source of the 100644 or 100755 you see in the index line.
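As a rough aid for reading these mode values, here is a minimal sketch of a lookup table in shell. The function name describe_mode is made up for this example; the modes listed are the ones git normally stores in trees.

```shell
#!/bin/sh
# describe_mode is a hypothetical helper, not a git command.
# It names the file mode found at the end of an index line or in the
# first column of `git cat-file -p <tree>` output.
describe_mode() {
    case "$1" in
        100644) echo "regular file" ;;
        100755) echo "executable file" ;;
        120000) echo "symbolic link" ;;
        160000) echo "gitlink (submodule)" ;;
        040000|40000) echo "tree (directory)" ;;
        *) echo "unknown mode" ;;
    esac
}

describe_mode 100644   # prints: regular file
describe_mode 100755   # prints: executable file
describe_mode 120000   # prints: symbolic link
```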

The "true name" of the blob (file object) in the git repo is that 40-character SHA-1. The 7-character abbreviated version for fmt.py is ba772ee. This is the second number in the two ..-separated numbers on the index line.

The first number on that line is the "true name" in the git repo of the previous version of the file, i.e., the version of fmt.py that was in the repo before I created commit d362e62.
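To make the layout of the index line concrete, here is a small sketch that splits one into its three parts using plain shell parameter expansion. The function name parse_index_line and the variables old_blob, new_blob and file_mode are invented for this example. It assumes the line contains the word "index" followed by a space, and it tolerates a pasted-in line number in front.

```shell
#!/bin/sh
# parse_index_line is a hypothetical helper. It sets old_blob, new_blob
# and file_mode from a line such as "index c44c267..ba772ee 100755".
parse_index_line() {
    # Drop everything up to and including "index " so that a leading
    # line number (like the "15" in the question) is ignored.
    rest=${1#*index }
    hashes=${rest%% *}          # e.g. "c44c267..ba772ee"
    old_blob=${hashes%%..*}     # text before the ".."
    new_blob=${hashes##*..}     # text after the ".."
    case "$rest" in
        *" "*) file_mode=${rest#* } ;;   # a mode follows the hashes
        *)     file_mode="" ;;           # no mode on this line
    esac
}

parse_index_line " 15 index e220f68..e611b24 100644"
echo "old=$old_blob new=$new_blob mode=$file_mode"
# prints: old=e220f68 new=e611b24 mode=100644
```

The empty-mode branch is there because git leaves the mode off the index line when the commit changes the file's mode, and reports the old and new modes on separate lines instead.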

We can use another bit of special git syntax to see these as well.[1] As documented in gitrevisions, following a commit-specifier with a hat character (circumflex, up-arrow, whatever you like to call it), ^, tells git to find the first parent of that commit. So:

$ git rev-parse d362e62^
41f3a6bcba1f5f7059133f862727809f49ff4657

tells us that the commit before the commit I gave to git show is the one named 41f3a6b.... And, sure enough, if we git cat-file -p that, we get another commit with another tree, and if we git cat-file -p that tree-ID and look for fmt.py we will find another blob with another SHA-1:

$ git cat-file -p 41f3a6b
tree cbfb63beec96eebf0c73ba6a501cc8151adfec8a
parent 80eeb496ea3f538aa14acdc6b0815024a5e99c7e
[snip]
$ git cat-file -p cbfb63beec96eebf0c73ba6a501cc8151adfec8a | grep fmt.py
100755 blob c44c267c4603838ac7a54aa450b33d0dd7a8bebc    fmt.py
$ 

And there it is: c44c267 is the abbreviated form of the "true name" of the file stored in the previous commit. This is the first number in the index line.

I wrote this all out in long form to illustrate how git gets from "point A" to "point B". But, just as with the short-hand syntax d362e62^{tree}, there is a very easy way to get the blob SHA-1 values using git rev-parse:

$ git rev-parse d362e62:fmt.py
ba772eeb6139de5a724d67d18ce01bfccaf57590
$ git rev-parse d362e62^:fmt.py
c44c267c4603838ac7a54aa450b33d0dd7a8bebc

If you want the shortened versions, use git rev-parse --short to truncate the SHA-1 values to (normally) 7 characters.

So, back to the question:

"And I want to know which commit generates the patch, and how to parse 15 index e220f68..e611b24 100644 in the patch?"

The 15 is a line number you (or someone somewhere) added, and now you know what the rest of the values on the index line are. But to find the commit—well, that's the hard part. The commit is what finds the other values. There is no link from "other values" back to "commit": the "arrows", as it were, only point from commits to trees, and then from trees to blobs. There are no pointers from blobs to trees, nor from trees to commits.
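Since nothing points from a blob back to a commit, the only option is to walk the commits and check each one. The sketch below does that by brute force. The function name find_patch_commit is made up, the path is assumed to be relative to the top of the repository (the a/ or b/ prefix removed), and only the first parent of each commit is compared, so merges may be missed.

```shell
#!/bin/sh
# find_patch_commit is a hypothetical helper, not a git command.
# It prints every commit, reachable from any ref, in which PATH went from
# a blob whose ID starts with OLD to a blob whose ID starts with NEW.
find_patch_commit() {
    path=$1
    old=$2
    new=$3
    # Only visit commits that touched the path.
    git rev-list --all -- "$path" | while read -r commit; do
        # Blob ID of the file in this commit and in its first parent.
        # Either lookup can fail (file absent, or no parent), so fall
        # back to an empty string.
        after=$(git rev-parse --verify --quiet "$commit:$path") || after=""
        before=$(git rev-parse --verify --quiet "$commit^:$path") || before=""
        case "$after" in "$new"*) ;; *) continue ;; esac
        case "$before" in "$old"*) echo "$commit" ;; esac
    done
}

# Usage, run inside a clone of the repository the patch came from:
#   find_patch_commit tools/python/xen/lowlevel/xc/xc.c e220f68 e611b24
```

More than one commit can match, for example when the same change was cherry-picked onto several branches. Recent versions of git also have a --find-object option for git log that may do this search more directly; check the git-log documentation for your version.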

Git always starts with some sort of externally specified name. Usually this is a branch name or tag, or a "symbolic reference" (as HEAD normally is, when you don't have a "detached head"). The reference locates a commit.[2] If the reference is a branch name, that commit is the "tip" of that branch.[3] If it's a tag, it still finds a commit. If it's HEAD, and HEAD names a branch like master, git just turns HEAD into master and then turns master into a commit. In other words, the commit is where you start, usually by going from name to commit-ID, but you can almost always specify a "raw" SHA-1 ID here.

Once git has a commit-ID, that commit identifies more commits (its parents) and a tree. The tree identifies sub-trees if needed, and the tree and its sub-trees identify blobs. Starting from all the commits that have "external names", git eventually finds all trees and all blobs. Any trees or blobs in the repository that are not found this way are eligible for garbage collection when you run git gc (or when git gc runs automatically). (This is how deleted branches, and any number of special temporary files that git creates internally, are cleaned up later.)


[1] Git has a lot of special syntax. The most useful ones to memorize, in my opinion:

  • hat after thing = parent: master^ = "parent of master"
  • tilde and number N after thing = back up N parents: master~2 = "grandparent of master"
  • X..Y = "all revisions selected by Y, excluding all revisions selected by X": git log master..devel = "log all commits on branch devel that are not on master"

The .. syntax is also used in git diff, but here instead of "stuff on Y that's not on X", you get a direct comparison of the version associated with X against the version associated with Y.
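A quick way to get a feel for these forms is to let git resolve them. The two helpers below are only a sketch with made-up names: one counts what X..Y selects, the other checks that "tilde 2" and "hat hat" name the same commit.

```shell
#!/bin/sh
# commits_only_on and same_grandparent are hypothetical helper names.

# Count the commits selected by "X..Y": reachable from Y but not from X.
commits_only_on() {
    git rev-list --count "$1..$2"
}

# Succeed if REV~2 names the same commit as REV^^ (parent of the parent).
same_grandparent() {
    [ "$(git rev-parse "$1~2")" = "$(git rev-parse "$1^^")" ]
}

# Usage inside a repository with branches master and devel:
#   commits_only_on master devel
#   same_grandparent master && echo "master~2 and master^^ agree"
```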

[2] I'm deliberately skipping over "annotated tags", which are also stored as objects in the repository. In some cases git will access the tag object, and in others (when it needs a commit, tree, and/or blob) git will automatically follow the annotated tag. Internally, an annotated tag looks very similar to a commit, except that instead of a tree and parents, it has a reference to another git repository object: usually directly to a commit, but sometimes to another tag, and in theory you can make an annotated tag for a tree or a blob, skipping over the commit part entirely.

[3] A branch name always points to the tip of its own branch, but that branch may be just a part of another branch. For instance, suppose you have a nice linear sequence of commits:

...<-- C3 <-- C4 <-- C5 <-- C6 <-- C7

where C7 has C6 as its parent, C6 has C5, and so on. If branch label X is a reference to commit C5, then branch X ends at C5. If branch label Y points to C7, branch Y ends at C7. In this case branch Y "contains" branch X, but not vice versa.
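Whether one branch "contains" another in this sense can be checked with git merge-base --is-ancestor, which succeeds when the first commit is reachable from the second. The wrapper below is a minimal sketch; the name branch_contains is made up for this example.

```shell
#!/bin/sh
# branch_contains is a hypothetical helper, not a git command.
# It reports whether the tip of $1 is reachable from the tip of $2,
# i.e. whether branch $2 "contains" branch $1.
branch_contains() {
    if git merge-base --is-ancestor "$1" "$2"; then
        echo "$2 contains $1"
    else
        echo "$2 does not contain $1"
    fi
}

# Usage inside a repository laid out like the example above:
#   branch_contains X Y    # Y contains X
#   branch_contains Y X    # X does not contain Y
```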
