How can I recover files deleted from both my remote and local repositories, and how can I prevent this loss in the future?


Problem description

I recently accidentally pushed all my project files to a GitHub repo without first creating a .gitignore file. I then added the .gitignore on GitHub after the fact and deleted the files that would have been ignored in the initial push, had the .gitignore file existed. After doing this I pulled the repo to my local Git, thinking that I would only get the .gitignore file; however, all the files to be ignored (.project, .classpath, *.jar, etc.), which are important for the development I was doing in Eclipse, were deleted.

How can I recover these lost files, and how can I go about adding the .gitignore file in the future without deleting the files?

Thanks for the help.

Solution

I see from comments that you already have your files back, which is good and means I don't have to write specific instructions for that (though I'll have a section on it). But: It's important to understand what .gitignore does. It doesn't actually cause files to be ignored! (Which means that .gitignore is not a very good name, but a good name would be really long. Probably it should have been called .gitxyz or .gitconfuse or something, so that it's short and memorable, but you don't think it means "ignore".)

So, what does .gitignore really do? Well, this requires a quick dive into the other part of Git that many tutorials or introductions gloss over, that causes a lot of confusion, and that is Git's index.

Background

First, let's note that what Git mainly does is to store commits. A commit holds a complete snapshot of all of your files—well, all the ones that the commit contains, but that's a bit redundant: "The commit has what the commit has." The important part here is that there's a full copy of every file, frozen and immutable, saved forever (or at least as long as the commit itself continues to exist).

(The commit also contains some metadata—some information like your name and email address as the author/committer, for instance, and your log message. Crucially, each commit also contains the true name—the hash ID—of the previous or parent commit. But we won't go into that here.)

These frozen files, saved permanently and immutably with each commit, would use up a lot of space if they were not compressed, so they are in a special, compressed, Git-only form. And in fact, since they are frozen, if two different commits use the same data for a file, those two commits actually share the frozen copy. This means that the fact that Git keeps putting the same copy of the file into every new commit, takes no space at all, because it's actually just re-using the old file. But in any case, these files are not useful to anything except Git: no other system can read them directly, and nothing—not even Git—can write on them: they're frozen.
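The sharing of frozen copies can be observed directly. What follows is only a small sketch in a throwaway repository: the temporary directory, the file names, and the identity settings are all made up for illustration. It asks Git for the hash ID of README.txt in two consecutive commits, using the same commit:path notation used later in this answer.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "first commit"

# The second commit adds a different file and leaves README.txt alone.
echo "notes" > notes.txt
git add notes.txt
git commit -q -m "second commit, README.txt unchanged"

# commit:path resolves to the hash ID of the frozen file content.
first=$(git rev-parse HEAD~1:README.txt)
second=$(git rev-parse HEAD:README.txt)
echo "$first"
echo "$second"
```

Both lines should show the same hash ID, which is how you can tell the two commits refer to one stored copy rather than two.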

So, to let you see and work on your files, Git has to un-freeze all the files that are saved in some commit, into some sort of work area. Git calls this area the work-tree or working tree. That's a good name, because it's where you do your work.

Other, non-Git, version control systems typically stop here: they have the committed files (maybe stored as deltas instead of full files), and the work-tree, and that's it. When you use one of these systems and go to make a new commit, it takes time, sometimes a whole lot of time. Sometimes you could go out and get lunch while you wait. With Git, though, you run git commit -m message and—zip—it's done.

Git gets all this speed from its index. But index is a terrible name for this thing, so Git also calls it the staging area, or sometimes the cache, depending on who / what is doing the calling. What the index does is hold—in a special Git-only form, but this time not frozen—all the files that are going to be in the next commit.

Initially, the index is filled from whichever commit you check out. That is, git checkout <some-commit-specifier> locates the commit, which contains the full set of frozen files. Git copies the frozen files (well, the link to their content) into the index, figuring out the files' full names along the way, so that the index has the list of all the files that Git needs to put in the work-tree. These are now in the special Git-only format, but unfrozen. Git then also puts the files into the work-tree, expanding them into useful format.

The end result is that the index matches the commit, but the index is unfrozen. The work-tree matches both the commit and the index, and of course is unfrozen and files have their useful form. You now do your work as usual—and, this explains why you have to git add your files all the time!

What git add does is to copy the work-tree file into the index. This overwrites the previous copy, if the file was already in the index. The new copy is now in the Git-only format (but still not yet frozen). If the file wasn't in the index before, now it is. In either case, the index is still ready to go. All git commit has to do, besides collecting the metadata like your name and email and log message, is freeze the index.

Hence, the best short description I know of for the index is this: The index contains your proposed next commit. It has all the files in it, in their special Git-only form, but not yet frozen. This is why it's also called the staging area: it has all the files in it, staged and ready to go.
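To watch the index act as the proposed next commit, here is a rough sketch, again in a scratch repository with invented names. It reads the hash ID of the index copy (:README.txt) before and after an edit, after git add, and compares it with what the new commit ends up holding.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first draft" > README.txt
git add README.txt
git commit -q -m "add README.txt"

index_before=$(git rev-parse :README.txt)

# Editing the work-tree file does not touch the index copy.
echo "a second line, added later" >> README.txt
index_after_edit=$(git rev-parse :README.txt)

# git add copies the work-tree file into the index, replacing the old copy.
git add README.txt
index_after_add=$(git rev-parse :README.txt)

# git commit freezes whatever is in the index at that moment.
git commit -q -m "update README.txt"
committed=$(git rev-parse HEAD:README.txt)

printf '%s\n' "$index_before" "$index_after_edit" "$index_after_add" "$committed"
```

The first two hash IDs should match each other, and the last two should match each other: the edit alone changes nothing in the index, and the commit contains exactly what git add put there.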

Staged, unstaged, untracked, ignored

Now that you know that there are three copies of each file to worry about, all of this will start to make sense. Let's consider a README.txt file, for instance. You run git checkout master to start, and Git finds the commit for master and checks it out, making that commit the current or HEAD commit:

  • HEAD:README.txt is frozen in the current commit. It will never change—it's part of that commit.

  • :README.txt is copied into the index, and unfrozen in the process. It could change, but it currently matches HEAD:README.txt.

  • README.txt (the ordinary file, with no prefix) is copied from the index to the work-tree, expanded into useful form. It could change, but it currently matches :README.txt.

All three copies match, so Git says nothing at all about the file.

If you now change the work-tree copy and run git status, Git's status command compares the HEAD and index copies. They are the same, so it says nothing about this. It compares the index and work-tree copies, and they are different, so git status says the file is not staged for commit.

Once you run git add README.txt, that copies (and compresses) the work-tree version into :README.txt. Now these two match, but HEAD:README.txt and :README.txt are different. So git status compares HEAD vs index and says that the file is staged for commit.

Note that you can change the work-tree copy yet again. Now the file is different in all three versions, and git status tells you that it's both staged for commit (HEAD and index don't match) and not staged for commit (index and work-tree don't match either). This is all based on the result of two git diffs: one from HEAD to index, and one from index to work-tree.

But what happens if you have a file that's in the HEAD commit that you remove from the index and work-tree? Well, now it's removed when comparing HEAD vs index. So Git says that a remove is staged. The index and work-tree match, so Git says nothing about that. In any case, your next commit won't have the file.

What happens if you have a file that's in the work-tree, but not in the index? If it's in the HEAD commit it's still a staged remove: the file won't be in the next commit. But it's also different in the index and work-tree, so it's untracked.

If the file isn't in HEAD and is not in the index, but is in the work-tree, it's untracked.

This tells us what it means to have an untracked file: a file is untracked if and only if it's not in the index right now. Since you can manipulate the index—adding or removing files whenever you like—you can change the tracked-ness of some file at any time, by just adding it to the index, or taking it out of the index.
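The states described above can be reproduced in a scratch repository. This is just an illustrative sketch with made-up file names and contents. It uses git status --porcelain, whose two-letter codes map onto the two comparisons: the first column is HEAD vs index, the second is index vs work-tree, and ?? marks an untracked file.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "version one" > README.txt
git add README.txt
git commit -q -m "add README.txt"

# Work-tree differs from index; index still matches HEAD.
echo "version two, edited" > README.txt
unstaged=$(git status --porcelain)      # " M README.txt"

# Index now differs from HEAD; work-tree matches index.
git add README.txt
staged=$(git status --porcelain)        # "M  README.txt"

# All three copies differ.
echo "version three, edited once more" > README.txt
both=$(git status --porcelain)          # "MM README.txt"

# In the work-tree but not in the index: untracked.
echo "scratch" > notes.txt
untracked=$(git status --porcelain -- notes.txt)   # "?? notes.txt"

printf '%s\n' "$unstaged" "$staged" "$both" "$untracked"
```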

If a file isn't in the index and isn't in the work-tree, it just doesn't exist. It's only files that are in the work-tree, but are not in the index, that are untracked. You could git add the file, and until you do, Git whines at you, nagging you about the file. Listing the file in .gitignore (or in .git/info/exclude) mainly shuts Git up. It doesn't actually cause the file to be untracked: that's a matter of whether the file is in the index. Once the file is in the index, it's tracked, and .gitignore has no effect on it. For untracked files, the listing just keeps git status from nagging you. So instead of .gitignore, maybe this should be .git-dont-complain-about-these-files-if-they-are-untracked.

It also has one other important effect. You can run git add . or git add somedir or git add --all to add a whole bunch of files based on Git searching through the entire list of files in a directory / folder. If you list some files as ignored, git add will skip them if they're not already tracked. That is, a tracked file is definitely in Git, so git add will copy it into the index if it's changed. But an untracked file isn't in Git yet, so an en-masse add will add it if it's not ignored. Here, "ignore" is the right word. So maybe the file should be called .git-dont-complain-about-these-files-if-they-are-untracked-and-dont-auto-add-them-with-an-en-masse-add-operation.
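Here is a small sketch of both effects at once, in a scratch repository with invented file names. lib.jar is committed before any .gitignore exists, so it is tracked; other.jar is created afterward and never added. Both match the *.jar pattern, but an en-masse git add . treats them differently.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "pretend jar" > lib.jar
git add lib.jar
git commit -q -m "add lib.jar before any .gitignore exists"

echo "*.jar" > .gitignore
echo "more bytes appended" >> lib.jar   # tracked, although it matches *.jar
echo "another jar" > other.jar          # untracked, and it matches *.jar

git add .

# Paths staged for the next commit, one per line.
staged=$(git diff --cached --name-only)
echo "$staged"
```

The staged list should contain .gitignore and lib.jar but not other.jar: the pattern only kept the untracked file out, while the tracked one was updated as usual.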

Unfortunately, there's another side effect of listing a file in .gitignore, and that is that you tell Git that it's OK to remove or clobber the file in some cases. So the full proper name for .gitignore might be .git-dont-complain-about-these-files-if-they-are-untracked-and-dont-auto-add-them-with-an-en-masse-add-operation-but-do-feel-free-to-clobber-these-files-sometimes. Imagine if that were the file's name! It would be less confusing, at least.

Drawbacks of removing a file with git rm --cached

As we saw above, if you want to make some file untracked, you must take it out of the index. If you use git rm --cached filename, Git will take the file out of the index (so now it's untracked), but not remove the file from the work-tree (so it's still where you can use it). Your next commit won't have the file, which is what you want.
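As a sketch of what that looks like for the situation in the question (the file names and contents are stand-ins, and the repository is a throwaway one), this stops tracking an IDE file in the same commit that adds the .gitignore, without touching the work-tree copy:

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "<projectDescription/>" > .project
echo "class Main {}" > Main.java
git add .
git commit -q -m "initial commit, IDE file included by mistake"

echo ".project" > .gitignore
git rm -q --cached .project     # remove from the index only
git add .gitignore
git commit -q -m "stop tracking .project and ignore it"

tracked=$(git ls-files)         # files in the index, one per line
echo "$tracked"
if [ -f .project ]; then echo ".project is still in the work-tree"; fi
```

The index (and so the new commit) should list only .gitignore and Main.java, while .project remains on disk for the IDE to use.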

But all the old commits, the frozen forever in time commits, that do have the file ... all those old commits still exist. If you ever check out one of those commits, Git will have to copy the frozen file into the index, and then copy the index copy into the work-tree. That will clobber your work-tree version. Is that OK? Git's answer to that is to check for the file in .gitignore!

If the file isn't listed in a .gitignore, Git won't feel free to clobber it. But you'll get complaints about it being untracked. To solve those complaints, you'll probably list the file in .gitignore. Then, checking out an old commit will clobber your files with the extracted, unfrozen, uncompressed ones—and then going back to the new commit will remove the files, because they're now the same as the frozen ones, so it's "safe".
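The clobber-then-remove sequence can be reproduced in a scratch repository. This is a sketch under the assumption of default checkout settings, with made-up names and contents: the old commit tracks .project, the new commit has untracked and ignored it, and the work-tree copy carries local edits that exist nowhere else.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "settings" > .project
git add .project
git commit -q -m "old commit: .project is tracked"

echo ".project" > .gitignore
git rm -q --cached .project
git add .gitignore
git commit -q -m "new commit: .project untracked and ignored"

# Local edits that live only in the work-tree.
echo "local edits, never committed" > .project

# Checking out the old commit overwrites the ignored file.
git checkout -q HEAD~1
after_old=$(cat .project)

# Going back removes it, since the new commit has no such file.
git checkout -q -
if [ -e .project ]; then after_new=present; else after_new=missing; fi

echo "$after_old"
echo "$after_new"
```

After the round trip the local edits are gone and so is the file itself, which is essentially what the pull did in the question.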

Git needs, but does not have, a way to list a work-tree file as "shut up about this, but never clobber it". If and when this is ever added, that will let you handle your situation. But it's best not to get into the situation at all, if possible.

Recovering what you can

Meanwhile, if you do lose your files, remember that at least there's some version(s) of them in the old frozen commits. You can extract those old frozen versions. There are a couple of ways to do this:

  1. git show: run git show commit:path, e.g., git show v1.0:README.txt or git show a123456:path/to/file.ext. This expands the frozen, saved file to standard output, so you can save it with I/O redirection: git show v1.0:README.txt > README.txt.old, for instance.

  2. git checkout has a mode where, instead of checking out a whole commit, it populates part of your index and work-tree from some existing commit. (This command probably should never have been called git checkout, since it's quite destructive if you have unsaved changes, so be careful with it.) Running git checkout v1.0 -- README.txt or git checkout a123456 -- path/to/file.ext will extract the named file from the named commit, copying (unfreezing) it into your index—so now your next commit will have that version of that file—and then on into your work-tree.

    This one is more useful if you have a whole directory of files to recover, or a glob pattern like *.jar, because you can git checkout the directory or pattern:

    git checkout HEAD~2 -- '*.jar'
    

    Here HEAD~2 is the commit to use (two steps back from the current commit along the first-parent chain), and *.jar, which needs quoting to protect it from the shell if there are any *.jar files in the current directory, is the pathspec that Git should match. (I think this should be equivalent to **/*.jar but if not, that is also a valid pathspec.) Since this populates your index, you will have to undo it afterward, e.g., with git rm --cached again, or with git reset (which also takes pathspecs, so you can git reset -- '*.jar').

Whether these frozen files are sufficient for your current situation is, of course, situation-dependent.
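Both recovery methods can be tried out safely in a scratch repository first. The following is a sketch only: the directory layout, file names, and contents are invented, and HEAD~1 stands in for whichever of your commits still has the files.

```shell
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir lib
echo "jar one" > lib/a.jar
echo "classpath" > .classpath
git add .
git commit -q -m "commit that still has the files"

git rm -q -r lib .classpath     # removes them from index and work-tree
git commit -q -m "commit that deleted them"

# Method 1: git show commit:path, redirected to a file.
git show HEAD~1:.classpath > .classpath

# Method 2: git checkout commit -- pathspec (quoted to protect it from the shell).
git checkout HEAD~1 -- '*.jar'
staged=$(git diff --cached --name-only)   # the checkout also filled the index
echo "$staged"

# Take the files back out of the index, leaving the work-tree copies alone.
git reset -q -- '*.jar'
git status --porcelain
```

At the end the recovered files are back on disk but untracked, so the next commit will not reintroduce them.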
