Why does output differ under git diff vs. git diff --staged?


Problem description

Why do we need both of these? And in what circumstances does the output they give on the command line differ?

Can you explain the differences between the two under different scenarios, such as adding files, staging, and modifying? And what are staged and unstaged changes?

Solution

You are, I think, being misled. Git doesn't store changes at all. The whole thing seems very mysterious until you realize that Git just stores everything intact, but does so in a weird way.

What Git stores permanently

First and most important, Git doesn't exactly store files. It winds up doing so, but that's because Git stores commits, and each individual commit contains (all!) the files. That is, at some earlier point during development, you—or someone—told Git: Here's this entire file-tree, some set of folders / directories containing files and sub-directories that contain more files and sub-directories and so on. Make a snapshot of how they all look right now. That snapshot, that entire copy of everything, goes into a new commit.

Next, commits, once made, are mostly permanent, and completely, totally, 100% read-only. You cannot change anything that's inside a commit. You can just think of them as permanent: the only way a commit can truly go away is if you carefully arrange, using git reset or similar tools, that no one (neither you nor anyone else) can find it later.

For many reasons, including not having the repository get enormously fat if you make many commits that keep re-using most of the old versions of most files, the files that are stored inside commits are kept in a special, compressed, Git-only format. Since the files inside commits are frozen, if new commit C9 is just like its previous commit C8 except for one file, the two commits will share all the identical files, too.
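
The sharing described above can be observed directly. The following is a minimal sketch, assuming a throwaway repository created with mktemp; the file contents, the user identity, and the commit messages C8 and C9 are illustrative only. It compares the object IDs that each commit records for each file.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# "C8": the first snapshot, containing both files.
printf 'docs, first version\n' > README.md
printf 'print("hello")\n' > main.py
git add README.md main.py
git commit -q -m "C8"
c8=$(git rev-parse HEAD)

# "C9": identical to C8 except for README.md.
printf 'docs, second version with more text\n' > README.md
git add README.md
git commit -q -m "C9"
c9=$(git rev-parse HEAD)

# Ask each commit which stored object it uses for each file.
main_c8=$(git rev-parse "$c8:main.py")
main_c9=$(git rev-parse "$c9:main.py")
readme_c8=$(git rev-parse "$c8:README.md")
readme_c9=$(git rev-parse "$c9:README.md")

echo "main.py:   C8=$main_c8 C9=$main_c9"
echo "README.md: C8=$readme_c8 C9=$readme_c9"
```

The unchanged main.py resolves to the same object ID in both commits, so it is stored once; only README.md gets a new object.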

What Git lets you work with, temporarily

Since you can't change any commit, at all, ever, Git would be useless if it did not have a way to extract all the files from some commit. Extracting a commit copies all of its files out of the deep-freeze, and then de-compresses the files and turns them back into ordinary, every-day files that you and your computer can work with. These files are copies of what was in that Git commit, but here, in this work area—the work-tree or working tree—they're useful to you and your computer, and you can change them any way you like.

Git complicates things with its index

Now comes the tricky bit. Other version control systems may stop here: they too have commits, which save the files forever in frozen form, and a work-tree, which lets you work on the files in ordinary form. To make a new commit, those other version control systems slowly, painfully, one by one, take each work-tree file, compress it down to get it ready for freezing, and then check to see if that frozen file will be the same as the old one. If so, they can re-use the old file! If not, they do whatever it takes to save away the new file. This is terribly slow, and there are various ways to speed it up, which they do use in general, but in these non-Git version control systems, after using their "commit" command, you can often get up and go get coffee, or go for a walk or have lunch or something.

Git does something radically different, and this is how git commit is so fast, compared to those other systems. When Git is taking files out of the deep-freeze to put into your work-tree, Git keeps a sort of semi-frozen—"slushy", if you will—copy of every file, ready to go into the next commit. Initially, these copies all match the frozen commit copy.

These sort-of-slushy copies of files are in what Git calls, variously, the index, the staging area, or the cache, depending on who or what part of Git is doing the calling. The key difference between these index copies of every file, and the frozen copy in the current commit, is that the committed copies really are frozen. They can't be changed. The index copies are only almost frozen: they can be changed, by writing a new file into the index in place of that old one.

What this means, in the end, is that for every file in the commit, you wind up with not two but three active copies, when you tell Git to make that commit be the current commit, using git checkout somebranch. (This checkout selects somebranch as the current branch name and therefore also extracts what Git calls its tip commit to be the current commit. There's always a name for this current commit: Git calls it HEAD.) Suppose, for instance, that the tip commit of master has two files, named README.md and main.py:

   HEAD           index         work-tree
---------       ---------       ---------
README.md       README.md       README.md
main.py         main.py         main.py

At this point, all three copies of each file match each other. That is, all three README.mds are the same, except in terms of their format: the one in HEAD is frozen and Git-only; the one in the index is semi-frozen and Git-only; and the one in your work-tree is usable and useful to you; but all three represent the same file contents. The same goes for the three copies of main.py.
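
One way to check that the three copies match is to compare object IDs. This is a sketch under assumptions: a throwaway repository and made-up file contents. It relies on git rev-parse HEAD:path for the committed copy, git rev-parse :path for the index copy, and git hash-object to compute the ID the work-tree file would get if it were stored.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'docs, first version\n' > README.md
printf 'print("hello")\n' > main.py
git add README.md main.py
git commit -q -m "initial snapshot"

# The frozen copy recorded in the current commit.
head_id=$(git rev-parse HEAD:README.md)
# The semi-frozen copy held in the index (":path" names the index entry).
index_id=$(git rev-parse :README.md)
# The ID the ordinary work-tree file would have if Git stored it now.
worktree_id=$(git hash-object README.md)

echo "HEAD:      $head_id"
echo "index:     $index_id"
echo "work-tree: $worktree_id"
```

Right after a commit or checkout, with nothing edited, all three lines print the same ID.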

Now suppose you change one (or both) of the work-tree files. For instance, suppose you change your work-tree README.md. Let's mark it with a (2) to indicate that it's different, and mark the old ones with (1) to remember which the old ones were:

    HEAD            index         work-tree
------------    ------------    ------------
README.md(1)    README.md(1)    README.md(2)
main.py(1)      main.py(1)      main.py(1)

You can now ask Git to compare the index copies of every file to the work-tree copies of every file, and this time, you'll see your change to README.md.
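
A small sketch of this state, assuming a throwaway repository and illustrative file contents. It uses --name-only so that the output is just the list of files that differ.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'docs, version 1\n' > README.md
printf 'print("hello")\n' > main.py
git add README.md main.py
git commit -q -m "initial snapshot"

# Change only the work-tree copy of README.md; do not run git add.
printf 'docs, version 2 with an edit\n' > README.md

# Index vs work-tree: README.md differs.
unstaged=$(git diff --name-only)
# HEAD vs index: nothing differs, because the index still holds version 1.
staged=$(git diff --cached --name-only)

echo "index vs work-tree: ${unstaged:-<nothing>}"
echo "HEAD vs index:      ${staged:-<nothing>}"
```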

When you run git add, you are really telling Git: Take the work-tree copy of the files I'm adding, and prepare them for freezing. Git will copy the work-tree copy of README.md or main.py (or both) back into the index, Git-ifying the contents, getting them ready for the next freeze:

    HEAD            index         work-tree
------------    ------------    ------------
README.md(1)    README.md(2)    README.md(2)
main.py(1)      main.py(1)      main.py(1)

This time, asking Git to compare the index copy (of everything) to the work-tree copy (of everything) shows nothing! They are the same, after all. To see a difference, you must ask Git to compare the HEAD commit to the index, or the HEAD commit to the work-tree. Either will suffice right now, because right now the index and work-tree match again.
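
The same experiment, continued through git add, might look like this. It is a sketch with assumed file contents in a throwaway repository.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'docs, version 1\n' > README.md
printf 'print("hello")\n' > main.py
git add README.md main.py
git commit -q -m "initial snapshot"

# Edit the work-tree copy, then copy it into the index.
printf 'docs, version 2 with an edit\n' > README.md
git add README.md

# Index vs work-tree: they match again, so nothing is listed.
unstaged=$(git diff --name-only)
# HEAD vs index: README.md now differs.
staged=$(git diff --cached --name-only)
# HEAD vs work-tree: README.md differs here too.
head_vs_worktree=$(git diff --name-only HEAD)

echo "index vs work-tree: ${unstaged:-<nothing>}"
echo "HEAD vs index:      ${staged:-<nothing>}"
echo "HEAD vs work-tree:  ${head_vs_worktree:-<nothing>}"
```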

Note, however, that you can change the work-tree copy again after you use git add. Suppose you modify README.md one more time, giving:

    HEAD            index         work-tree
------------    ------------    ------------
README.md(1)    README.md(2)    README.md(3)
main.py(1)      main.py(1)      main.py(1)

Now all three copies of main.py match, but all three copies of README.md are different. So now it matters whether you have Git compare HEAD vs index, or HEAD vs work-tree, or index vs work-tree: each will show a different change to README.md.
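
To see the three copies of README.md side by side, one option is to print each of them. This sketch assumes a throwaway repository and uses the literal strings "version 1", "version 2" and "version 3" as stand-ins for the (1), (2) and (3) markers in the table.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'version 1\n' > README.md
git add README.md
git commit -q -m "initial snapshot"

# Stage version 2, then edit the work-tree copy again without staging.
printf 'version 2\n' > README.md
git add README.md
printf 'version 3\n' > README.md

# Read each copy: committed, staged, and ordinary file.
head_copy=$(git show HEAD:README.md)
index_copy=$(git show :README.md)
worktree_copy=$(cat README.md)

echo "HEAD:      $head_copy"
echo "index:     $index_copy"
echo "work-tree: $worktree_copy"
```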

Git makes new commits from the index

When and if you do choose to make a new commit—a new permanent snapshot of all the files as they stand now—Git makes the new commit's snapshot using the semi-frozen files in the index. All that the commit verb has to do with them is finish the freezing process (which, at a technical level, consists of making tree objects to hold them, but you don't need to know this). So git commit collects your name, email, the time, your log message, and the current commit's hash ID, freezes the index, and puts all of those together into a new commit. The new commit becomes the HEAD commit, so that now HEAD refers to the new commit. If the old commit was C8 and the new one is C9, HEAD used to mean C8, but now it means C9.

Once that commit finishes, the HEAD and index copies of every file automatically match. It's obvious that they must, since the new HEAD was made from the index. So if you make that new commit with the index holding the middle version of README.md, you get:

    HEAD            index         work-tree
------------    ------------    ------------
README.md(2)    README.md(2)    README.md(3)
main.py(1)      main.py(1)      main.py(1)

Note that Git completely ignored the work-tree during this process! There's a way to tell git commit that it should look at the work-tree and automatically run git add, but let's leave that for later.

The summary of this particular section is that a good way to think of the index is: The index contains the next commit you propose to make. The git add command means: Update my proposed next commit. This explains why you have to git add all the time.
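
The following sketch illustrates that the commit is built from the index and not from the work-tree. It assumes a throwaway repository, and the "version N" strings are placeholders for real file contents.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'version 1\n' > README.md
git add README.md
git commit -q -m "initial snapshot"

# Index holds version 2; work-tree holds version 3.
printf 'version 2\n' > README.md
git add README.md
printf 'version 3\n' > README.md

# Commit whatever is in the index right now.
git commit -q -m "commit the staged version"

head_copy=$(git show HEAD:README.md)
worktree_copy=$(cat README.md)
# After the commit, HEAD and index match, so nothing is staged.
staged=$(git diff --cached --name-only)
# The work-tree edit was not part of the commit and is still unstaged.
unstaged=$(git diff --name-only)

echo "HEAD now holds:       $head_copy"
echo "work-tree still has:  $worktree_copy"
echo "staged:   ${staged:-<nothing>}"
echo "unstaged: ${unstaged:-<nothing>}"
```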

Git's diff verb

Because there are these three simultaneous, active copies of each file—one permanent, one proposed for the next commit, and one that you can actually see and work with—Git needs a way to compare these things. The diff verb is how you ask Git to compare two things, and its options are how you select which two things to compare:

  • git diff commit-A commit-B tells Git: Extract the snapshot in commit A to a temporary area; extract the snapshot in commit B to a temporary area, and then compare them and show me what's different. This is useful in general, but not so much when making a new commit, since it's about existing, frozen, unchangeable commits.

  • git diff—with no options or commit specifiers at all—tells Git: Compare the index to the work-tree. Git does not look at any actual commit, it just looks at the index—the proposed next commit—and compares to your usable copies of files. Whenever something is different, you could use git add to copy it into the index. So this tells you what you could git add, if you wanted.

  • git diff --cached or git diff --staged—the options have exactly the same meaning—tells Git: Compare the HEAD commit to the index. This time, Git does not look at your work-tree at all. It just finds out what's different between the current commit and the proposed next commit. That is, this is what would be different if you committed right now.

  • git diff HEAD (or more generally, git diff commit) tells Git: Compare what's in the commit I named, such as HEAD, to what's in the work-tree. This time, Git ignores your index, and just goes with the specific commit—such as HEAD—and the contents of the work-tree. This is not as useful as the HEAD-vs-index or index-vs-work-tree comparisons, but you can do it if you want.

There are, of course, more ways you might want to compare any two items, so git diff has a lot of options. But these are the main ones of interest at this point.
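
The four forms listed above can be tried in one small experiment. This is only a sketch: the repository is a throwaway one, the variable names commit_a and commit_b are invented for the demo, and --name-only is used so that each comparison prints just the names of the files that differ.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'docs v1\n' > README.md
printf 'print("v1")\n' > main.py
git add README.md main.py
git commit -q -m "commit A"
commit_a=$(git rev-parse HEAD)

# Commit B changes only main.py.
printf 'print("v2, committed")\n' > main.py
git add main.py
git commit -q -m "commit B"
commit_b=$(git rev-parse HEAD)

# Stage a README.md change; leave a main.py change unstaged.
printf 'docs v2, staged\n' > README.md
git add README.md
printf 'print("v3, not staged")\n' > main.py

between_commits=$(git diff --name-only "$commit_a" "$commit_b")  # commit vs commit
index_vs_worktree=$(git diff --name-only)                        # index vs work-tree
head_vs_index=$(git diff --cached --name-only)                   # HEAD vs index
head_vs_worktree=$(git diff --name-only HEAD)                    # HEAD vs work-tree

echo "A vs B:             $between_commits"
echo "index vs work-tree: $index_vs_worktree"
echo "HEAD vs index:      $head_vs_index"
echo "HEAD vs work-tree:  $(echo $head_vs_worktree)"
```

In this setup the plain git diff lists only main.py, git diff --cached lists only README.md, and git diff HEAD lists both, because it skips the index and compares the commit straight to the work-tree.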

git status runs two git diffs

Note that the two most useful git diffs above, when you're actively developing, are git diff --cached, which tells you what would be different if you committed right now, and git diff with no options, which tells you what else could be different if you ran git add right now. The git status command, which you should use often, runs both of these diffs for you! It runs them with the --name-status flag set, internally, so that instead of showing the actual differences, it just shows the file's name if the file is changed. [1]

Let's see that again: git status runs two git diff commands. The first one is git diff --cached, i.e., what's different between HEAD and the proposed commit. These are changes that are staged for commit. The second is a plain git diff, i.e., what's different between the index (the proposed commit) and the work-tree. These are changes that are not staged for commit.
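
The correspondence can be checked by running the two diffs by hand next to git status. This sketch does not inspect what git status does internally; it only compares outputs. The repository is a throwaway one and notes.txt is an invented file name. In git status --short, the first column reports the HEAD-vs-index comparison and the second column reports the index-vs-work-tree comparison.

```shell
set -eu

# Throwaway repository (path and identity are assumptions for this demo).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'version 1\n' > README.md
git add README.md
git commit -q -m "initial snapshot"

# Stage an edit to README.md and a brand-new file, then edit README.md again.
printf 'version 2, staged\n' > README.md
printf 'a new file\n' > notes.txt
git add README.md notes.txt
printf 'version 3, not staged\n' > README.md

short_status=$(git status --short)
staged=$(git diff --cached --name-status)   # changes staged for commit
unstaged=$(git diff --name-status)          # changes not staged for commit

echo "git status --short:"
printf '%s\n' "$short_status"
echo "git diff --cached --name-status:"
printf '%s\n' "$staged"
echo "git diff --name-status:"
printf '%s\n' "$unstaged"
```

README.md shows up as MM in the short status because it appears in both diffs, while notes.txt shows up as A in the first column only, since it is new in the index and matches the work-tree.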

So now you know what git status tells you, and when you would want to use git diff with or without --staged to see more than just the names of the files. Remember that the changes that git diff shows you are what Git is figuring out: the files inside the index, or in the work-tree, are full, complete copies. They just may be different from each other and/or different from the full, complete copy in HEAD.


[1] The "status" part of --name-status can instead say that a file is added (it is in the index, but not in the HEAD commit), for instance. Or, in some cases, it can say that a file is renamed or has had some other auxiliary change, but let's not get into this here.
