How to sync local history after massive git history rewrite?

Question

The question may seem odd, but I have issues syncing git history after rewriting over 100 commits.

On the machine I did the rewrite from, a simple git fetch synced everything.

On another Mac machine, git sync did not help, but after randomly deleting local .git/ log and refs files and then issuing git pull, the history got refreshed.

However, no matter what I do on the Windows machine, I cannot refresh project history. Tried it all:

  • git reset --hard HEAD & git fetch
  • git fetch --all
  • git pull
  • etc

Each time on Windows machines, I get duplicated entries (I changed Author fields) of the same commit with a different author.

I did the massive history rewrite by following this tutorial:

https://help.github.com/articles/changing-author-info/

Open Terminal.

Create a fresh, bare clone of your repository:

git clone --bare https://github.com/user/repo.git
cd repo.git
Copy and paste the script, replacing the following variables based on the information you gathered:

OLD_EMAIL
CORRECT_NAME
CORRECT_EMAIL

#!/bin/sh

git filter-branch --env-filter '
OLD_EMAIL="your-old-email@example.com"
CORRECT_NAME="Your Correct Name"
CORRECT_EMAIL="your-correct-email@example.com"
if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]
then
    export GIT_COMMITTER_NAME="$CORRECT_NAME"
    export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]
then
    export GIT_AUTHOR_NAME="$CORRECT_NAME"
    export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
' --tag-name-filter cat -- --branches --tags
Press Enter to run the script.
Review the new Git history for errors.
Push the corrected history to GitHub:

git push --force --tags origin 'refs/heads/*'
Clean up the temporary clone:

cd ..
rm -rf repo.git

Has anyone experience with massive git history rewrite? If yes, what are the steps for other team members to refresh their git history?

Recommended answer

The key (or keys) to understanding the issues here is (are) that, in Git:

  • Commits are the history.
  • The "true name" of any commit is its hash ID.
  • No commit can ever be changed.
  • Each commit remembers its previous (immediate ancestor, aka parent) commit(s) by hash ID.
  • Names, including branch and tag names, mainly just store one (1) hash ID.
  • The special property of a branch name is that it changes which hash ID it stores, as the branch grows, normally in a "nice" manner so that whatever commit the branch names today, that commit (by hash ID) eventually leads back to the commit (by hash ID) that the name identified yesterday.
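These points can be checked directly in a throwaway repository. The following is a minimal sketch, not part of the original answer; the temporary directory, the "Demo" identity, and the two empty commits are assumptions made purely for illustration.

```shell
#!/bin/sh
# Sketch: a branch name stores one hash ID; a commit stores its parent's hash ID.
# The throwaway repository and the "Demo" identity are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the text
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git commit -q --allow-empty -m "A"
first=$(git rev-parse master)     # the hash ID that master stores "yesterday"
git commit -q --allow-empty -m "B"
second=$(git rev-parse master)    # master now stores a different hash ID
parent=$(git rev-parse master^)   # B remembers A by hash ID

git show-ref --heads              # prints: <hash ID> refs/heads/master
git cat-file -p master            # raw commit: tree, parent, author, committer, message
echo "yesterday: $first"
echo "today:     $second"
echo "parent:    $parent"
```

The "parent" line printed by git cat-file should match the hash ID that master held before the second commit, which is the "nice" growth described above.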

When you "rewrite history", you do not—you can not—change any existing commit. Instead, you copy every existing commit. What git filter-branch does is to copy all the commits you request, in "oldest" (most ancestral) to "newest" (least ancestral / tip-most) order, applying filters as it goes:

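  • extract the original commit;
  • apply the filters;
  • make a new commit from the result, with any changes to parent hash IDs as determined by any previous copies.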
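The three steps can be imitated by hand for a single commit with git commit-tree. This is only a rough sketch of the idea, not how git filter-branch is actually implemented; the identities and the temporary repository are assumptions, and unlike filter-branch it does not preserve the original dates.

```shell
#!/bin/sh
# Sketch: "copy" one commit with a changed author, by hand.
# Names, e-mail addresses and the temporary repository are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME="Old Name" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Old Name" GIT_COMMITTER_EMAIL="old@example.com"
echo hello > file.txt
git add file.txt
git commit -q -m "A"
old=$(git rev-parse master)

# 1. Extract the original commit: its tree (snapshot) and its message.
tree=$(git rev-parse "$old^{tree}")
msg=$(git log -1 --format=%B "$old")

# 2. Apply the "filter": here, a corrected author and committer identity.
# 3. Write a new commit object from the result. A is a root commit, so no -p;
#    a later commit would pass -p <hash ID of its parent's copy>.
new=$(GIT_AUTHOR_NAME="New Name" GIT_AUTHOR_EMAIL="new@example.com" \
      GIT_COMMITTER_NAME="New Name" GIT_COMMITTER_EMAIL="new@example.com" \
      git commit-tree "$tree" -m "$msg")

echo "original: $old"
echo "copy:     $new"
git cat-file -e "$old" && echo "the original commit still exists, unchanged"
```

The copy has the same tree and message but a different hash ID, because the author and committer fields are part of what gets hashed.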

In the end, what this means for a really massive rewrite is that you have, in essence, two different repositories placed side-by-side: the old one, with its old commits, and the new one, with its new commits. At the end of the filtering process, git filter-branch changes the names to point to the new copies.

If you had a tiny repository with just three commits—let's call them commits A through C—and one master branch, and all three commits needed some change(s), you would have this:

A--B--C   [was the original master]

A'-B'-C'  <-- master

The new commits are, literally, new commits. Anyone still using the old commits is literally still using the old commits. They must stop using those commits and start, instead, using the new commits.

In some cases, the filter(s) you specify with git filter-branch wind up not changing anything at all in an original commit. In this case—if the new commit that filter-branch writes is bit-for-bit identical to the original commit—then, and only then, the new commit is actually the same as the old commit. If we look at this same three-commit original repository, but choose a filter that modifies the content or metadata of only the second B commit, we get instead:

A--B--C
 \
  B'-C'  <-- master

as the final result.

Note that this occurs even though nothing about original C was changed by the filtering. This is because something about original B was changed, resulting in new-and-different commit B'. Hence, when git filter-branch copied C, it had to make one change: the parent of the copy C' is the new B' rather than the original B.

That is, git filter-branch copied A to a new commit, but made no change at all (not even to any parent information), so the new commit turned out to be a re-use of original A. Then it copied B to a new commit, and made a change, so the new commit is now B'. Then it copied C without making changes, changed the parent to B', and wrote new commit C'.
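The A, B', C' outcome can be reproduced in a scratch repository. The sketch below assumes a three-commit history in which only B carries the old author e-mail; the addresses and file names are made up for the demonstration, and FILTER_BRANCH_SQUELCH_WARNING is set only to skip the warning pause that newer Git versions add to filter-branch.

```shell
#!/bin/sh
# Sketch: rewrite only commit B and compare hash IDs before and after.
# Repository contents and e-mail addresses are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="good@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="good@example.com"

for name in A B C
do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    if [ "$name" = B ]
    then
        # Only B is authored with the e-mail address we want to replace.
        GIT_AUTHOR_EMAIL="old@example.com" git commit -q -m "$name"
    else
        git commit -q -m "$name"
    fi
done
a=$(git rev-parse master~2)
b=$(git rev-parse master~1)
c=$(git rev-parse master)

export FILTER_BRANCH_SQUELCH_WARNING=1
git filter-branch --env-filter '
if [ "$GIT_AUTHOR_EMAIL" = "old@example.com" ]
then
    export GIT_AUTHOR_EMAIL="good@example.com"
fi
' -- master >/dev/null 2>&1

a2=$(git rev-parse master~2)
b2=$(git rev-parse master~1)
c2=$(git rev-parse master)
echo "A: $a -> $a2"
echo "B: $b -> $b2"
echo "C: $c -> $c2"
```

A should keep its hash ID, while B and C should both get new ones, even though the filter touched nothing in C itself.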

If your filter made a change only to C, the git filter-branch command would copy A to itself, B to itself, and C to C', giving:

A--B--C
    \
     C'  <-- master

Dealing with an upstream rewrite

In general, the easiest way for people to deal with a really massive upstream origin rewrite is for them to discard their existing repositories entirely. That is, we'd expect to share no more than a few original commits: at some early point in the massive rewrite, we change commit A or one near it, so that every subsequent commit has to be copied to a new commit. Thus, creating a new clone is probably not much if any more expensive than updating an existing one. It's certainly easier!

This is not, strictly speaking, necessary. As a "downstream" consumer, we can run git fetch and obtain all the new commits with their updated branch names, and perhaps updated tags (be especially careful here as tags won't update by default). But since we have our own branch names, pointing to the original commits and not the newly-copied commits, we must now make each of our branch names refer to the newly-copied commits, perhaps also copying any commits that we have that the upstream did not have (and hence did not already copy).
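The caveat about tags can be seen in a small experiment. This sketch assumes a local "upstream" repository standing in for origin, a tag named v1, and a rewrite done with git commit --amend instead of filter-branch to keep it short; none of these come from the question. The --tags --force combination reflects how recent Git versions behave, so treat it as something to verify against your own Git.

```shell
#!/bin/sh
# Sketch: after an upstream rewrite, a plain fetch updates origin/master
# but leaves an existing local tag pointing at the old commit.
# Repository names, the tag v1 and the identities are illustrative assumptions.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="old@example.com"

git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    echo a > a.txt
    git add a.txt
    git commit -q -m "A"
    git tag v1
)
git clone -q upstream downstream
old=$(git -C upstream rev-parse master)

# Upstream rewrite: copy A to A' with a new author, then move the tag.
(
    cd upstream
    git commit -q --amend --no-edit --author="Dev <new@example.com>"
    git tag -f v1 >/dev/null
)
new=$(git -C upstream rev-parse master)

cd downstream
git fetch -q origin
tracking=$(git rev-parse origin/master)   # follows the rewrite
tag_before=$(git rev-parse v1)            # still names the old commit
git fetch -q --tags --force origin        # explicitly overwrite existing tags
tag_after=$(git rev-parse v1)
echo "origin/master: $tracking"
echo "v1 before:     $tag_before"
echo "v1 after:      $tag_after"
```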

In other words, we could, for each of our branches, run:

git checkout <branch>
git reset --hard origin/<branch>

to make our branch name, as its tip commit, the same commit that origin/branch names. (Remember, git fetch force-updates all of our origin/branch names to match the hash ID to which branch points on origin.)
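For a clone with several branches, the two commands can be wrapped in a loop. The following sketch builds its own upstream and downstream repositories so it can be run safely; the branch names master and dev and the e-mail addresses are assumptions. In a real clone only the loop at the end matters, and note that reset --hard discards uncommitted work and any local-only commits on each branch.

```shell
#!/bin/sh
# Sketch: point every local branch at its rewritten origin/<branch>.
# Repositories, branch names and identities are illustrative assumptions.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="old@example.com"

git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    echo a > a.txt && git add a.txt && git commit -q -m "A"
    git checkout -q -b dev
    echo d > d.txt && git add d.txt && git commit -q -m "D"
    git checkout -q master
)
git clone -q upstream downstream
(
    cd downstream
    git checkout -q -b dev origin/dev
    git checkout -q master
)

# Upstream rewrites every commit on every branch.
(
    cd upstream
    FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --env-filter '
export GIT_AUTHOR_EMAIL="new@example.com"
export GIT_COMMITTER_EMAIL="new@example.com"
' -- --branches >/dev/null 2>&1
)

cd downstream
old_master=$(git rev-parse master)
git fetch -q origin
for branch in $(git for-each-ref --format='%(refname:short)' refs/heads)
do
    # Skip local branches that have no counterpart on origin.
    git rev-parse -q --verify "refs/remotes/origin/$branch" >/dev/null || continue
    git checkout -q "$branch"
    git reset -q --hard "origin/$branch"   # WARNING: discards local-only commits
done
git log --branches --format='%h %ae %s'
```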

This is equivalent to deleting each of our branches and using git checkout to re-create them. In other words, it won't carry forward any of our commits that whoever rewrote origin did not copy (they could not copy them, because they did not have them). To carry forward our commits, we must do the same thing we would do to deal with an upstream rebase. Whether the built-in fork-point code will do that correctly for you (it often will if your Git is at least 2.0) is really a matter for a separate question, and has been answered elsewhere already. Note that you will have to do this for each branch in which you have commits you wish to carry forward.
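One way to carry a local commit forward is git rebase --onto, given the tip of the old upstream history. The sketch below is one possible approach under stated assumptions, not the only one: it records the old origin/master before fetching, uses a local commit called X that upstream never saw, and simulates the rewrite with git commit --amend. If you have already fetched, git rebase --fork-point origin/<branch> may be able to recover the old base from the reflog instead.

```shell
#!/bin/sh
# Sketch: replay our own commit X onto the rewritten upstream history.
# Repositories, file names and identities are illustrative assumptions.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="old@example.com"

git init -q upstream
(
    cd upstream
    git symbolic-ref HEAD refs/heads/master
    echo a > a.txt
    git add a.txt
    git commit -q -m "A"
)
git clone -q upstream downstream

# Our own commit X, which upstream never had and so never copied.
(
    cd downstream
    echo x > x.txt
    git add x.txt
    git commit -q -m "X"
)

# Upstream rewrite: A becomes A'.
(
    cd upstream
    git commit -q --amend --no-edit --author="Dev <new@example.com>"
)

cd downstream
old_base=$(git rev-parse origin/master)   # tip of the OLD history, saved before fetching
git fetch -q origin
# Copy everything after old_base (just X here) on top of the rewritten tip.
git rebase -q --onto origin/master "$old_base" master
git log --format='%h %ae %s' master
```

After the rebase, master should consist of A' followed by a copy of X, and the old A is no longer reachable from the branch.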
