Git: Copy history of file from one repository to another

Problem description

I have two Git repositories, say A and B, and both contain a file named file1.cc. Is it possible to merge/copy the history of file1.cc in repo A into file1.cc in repo B?

The problem is that we've already moved the files from repo A to repo B, and the history of all the files was lost. Some of the developers have already started working on repo B and pushed their changes. So now I want to merge/copy the history from repo A into repo B, but only for some of the files. Is it possible to do so? Or is file history, once lost, lost forever?

Please help. Thanks in advance.

Solution

It can be done, but it may not be easy. But first things first: there is no "moving the history of a file". There is only moving commits, so if you want commits that represent the history of a subset of files, then creating those commits is the first challenge.

The simplest thing would be to transfer all history. (In fact, if it happens that you made Repo B as a shallow clone of Repo A, then you could just un-shallow it and be done. But I'm guessing that's not how you created Repo B...)

Regardless, since you're moving from Repo A to Repo B, maybe there's some history you specifically want to remove. That's potentially a whole topic of its own, but let's just assume you really want only the history of a few files.

In the special case where all the files you want (and no others) are in a subdirectory, and you want (or, at least, can accept) to move those files to the repo's root directory, you can use filter-branch with the --subdirectory-filter.
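
As a rough sketch of that special case, the script below builds a throwaway repository and keeps only the history under lib/, which becomes the new root. The lib/ and docs/ layout and the file contents are made up for illustration; FILTER_BRANCH_SQUELCH_WARNING only silences the pause that recent Git versions add before filter-branch runs.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for Repo A with one wanted and one unwanted directory.
work=$(mktemp -d)
git init -q "$work/repo-a"
cd "$work/repo-a"
git symbolic-ref HEAD refs/heads/master
mkdir lib docs
echo 'int f();' > lib/file1.cc
echo 'notes' > docs/readme.txt
git add .
git commit -q -m 'add lib and docs'
echo 'int g();' >> lib/file1.cc
git commit -q -am 'extend file1.cc'

# Keep only commits touching lib/, and make lib/ the repository root.
git filter-branch --subdirectory-filter lib -- --all >/dev/null 2>&1

files=$(git ls-tree -r --name-only HEAD)
count=$(git rev-list --count HEAD)
echo "files: $files, commits: $count"
```

Run this on a scratch clone rather than the real Repo A, since the rewrite replaces every branch it touches.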

More generally, if we assume paths shouldn't change and that the files you want could be anywhere in the tree, then you could use filter-branch with an --index-filter.

git filter-branch --index-filter 'git rm --cached --ignore-unmatch <each file or glob you do NOT want>' --prune-empty -- --all

That could take a while if the repo had a lot of commits. If the list of files to rm is not trivial, you may want to put multiple git rm commands in a shell script and use that as the --index-filter argument instead of inlining it as shown above.
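
Here is a minimal sketch of the script variant, run against a throwaway repository. The script name drop-unwanted.sh, the docs/ directory and the *.log pattern are all hypothetical; substitute whatever you actually want removed. The script is called by absolute path because filter-branch runs the filter from a temporary directory.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-in for Repo A.
work=$(mktemp -d)
git init -q "$work/repo-a"
cd "$work/repo-a"
git symbolic-ref HEAD refs/heads/master
mkdir docs
echo 'int f();' > file1.cc
echo 'notes' > docs/readme.txt
git add .
git commit -q -m 'add file1.cc and docs'
echo 'log output' > build.log
git add build.log
git commit -q -m 'add build.log only'
echo 'int g();' >> file1.cc
git commit -q -am 'extend file1.cc'

# One git rm per thing to drop; --ignore-unmatch keeps commits without them from failing.
cat > "$work/drop-unwanted.sh" <<'EOF'
#!/bin/sh
git rm -q -r --cached --ignore-unmatch docs
git rm -q --cached --ignore-unmatch '*.log'
EOF

# Rewrite every branch; commits left with no changes are pruned.
git filter-branch --index-filter "sh '$work/drop-unwanted.sh'" --prune-empty -- --all >/dev/null 2>&1

files=$(git ls-tree -r --name-only HEAD)
count=$(git rev-list --count HEAD)
echo "files: $files, commits: $count"
```

The commit that only added build.log disappears entirely, which is the effect of --prune-empty.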

Well, one way or another, hopefully you've got a history you'd like to graft into Repo B.

cd repo-b
git remote add repo-a path/to/repo-a
git fetch repo-a

Now you have in Repo B:

... A -- B <--(repo-a/master)
  \
   (repo-a/other-branches-maybe)

B' -- C -- D (master)(origin/master)

So I'm making an assumption here, that the TREE from the last master commit in Repo A - the one from which our history rewrite created B - or at least some part of that tree, was imported as the root commit in Repo B.
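
That assumption is easy to check before choosing an option. The sketch below sets up two small stand-in repositories (their contents are invented) and compares the tree of repo-a/master with the tree of Repo B's root commit. In a real repository only the last few lines matter.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for Repo A: commits A and B on master.
git init -q "$work/repo-a"
cd "$work/repo-a"
git symbolic-ref HEAD refs/heads/master
echo v1 > file1.cc; git add file1.cc; git commit -q -m A
echo v2 > file1.cc; git commit -q -am B

# Stand-in for Repo B: B' imports Repo A's final snapshot, then C follows.
git init -q "$work/repo-b"
cd "$work/repo-b"
git symbolic-ref HEAD refs/heads/master
cp "$work/repo-a/file1.cc" .; git add file1.cc; git commit -q -m "B' (import)"
echo v3 > file1.cc; git commit -q -am C

git remote add repo-a "$work/repo-a"
git fetch -q repo-a

# Compare the last tree of the old history with the first tree of the new one.
root=$(git rev-list --max-parents=0 master)
old_tree=$(git rev-parse 'repo-a/master^{tree}')
new_tree=$(git rev-parse "$root^{tree}")
if [ "$old_tree" = "$new_tree" ]; then
    echo "trees match"
else
    echo "trees differ"
fi
```

If you filtered Repo A down to a subset of files, the trees will only match when Repo B's root commit contains exactly that subset; otherwise expect "trees differ" and compare the individual files instead.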

Now you have three options: re-parent, rebase, or replace.

Since I assume the recent history state is more important than the older-history state, and that the older history is just being added for reference, the safest thing would be to reparent C to B. (You could choose to reparent B' to A instead, but I'm assuming that doesn't make much difference...)

So, drawing from the filter-branch docs at https://git-scm.com/docs/git-filter-branch you could:

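# be sure you're on master
# commit_id and graft_id are shell variables holding the two SHAs described below
echo "$commit_id $graft_id" >> .git/info/grafts
git filter-branch "$graft_id"..HEAD

where $commit_id is the SHA for C (the commit that gets a new parent) and $graft_id is the SHA for B (the new parent). Each line of the grafts file lists a commit followed by its parents, so the order matters. Note that Git 2.18 and later deprecate .git/info/grafts in favor of git replace --graft, which the current filter-branch docs use for the same step.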
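
For a feel of how the reparenting plays out, here is a sketch on two stand-in repositories with invented contents. It uses git replace --graft, the form the current filter-branch docs show, rather than writing to .git/info/grafts; the variable names commit_id and graft_id are only illustrative.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for Repo A: commits A and B on master.
git init -q "$work/repo-a"
cd "$work/repo-a"
git symbolic-ref HEAD refs/heads/master
echo v1 > file1.cc; git add file1.cc; git commit -q -m A
echo v2 > file1.cc; git commit -q -am B

# Stand-in for Repo B: B' imports Repo A's final snapshot, then C and D follow.
git init -q "$work/repo-b"
cd "$work/repo-b"
git symbolic-ref HEAD refs/heads/master
cp "$work/repo-a/file1.cc" .; git add file1.cc; git commit -q -m "B' (import)"
echo v3 > file1.cc; git commit -q -am C
commit_id=$(git rev-parse HEAD)            # C, the commit to reparent
echo v4 > file1.cc; git commit -q -am D

git remote add repo-a "$work/repo-a"
git fetch -q repo-a
graft_id=$(git rev-parse repo-a/master)    # B, the new parent

tree_before=$(git rev-parse 'master^{tree}')

# Give C the parent B, then make that permanent by rewriting C and D.
git replace --graft "$commit_id" "$graft_id"
git filter-branch "$graft_id"..HEAD >/dev/null 2>&1
git replace -d "$commit_id" >/dev/null

tree_after=$(git rev-parse 'master^{tree}')
count=$(git --no-replace-objects rev-list --count master)
echo "commits on master: $count"
```

The tree at the tip is the same before and after; only the parent links (and therefore the commit IDs of C and D) change.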

A rebase might be a little simpler (assuming a certain level of consistency between the histories) but introduces the possibility that you end up modifying the tree at D. If you do decide to try a rebase, it would be

git rebase --onto repo-a/master B' master

where B' is the Repo B root commit's SHA ID. (Alternatively:

git rebase --interactive --onto repo-a/master --root master

and then drop the entry for B'.)
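
A quick sketch of the non-interactive rebase on stand-in repositories (contents invented for illustration). Here B and B' have identical trees, so C and D replay cleanly; with a real history you may hit conflicts.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for Repo A: commits A and B on master.
git init -q "$work/repo-a"
cd "$work/repo-a"
git symbolic-ref HEAD refs/heads/master
echo v1 > file1.cc; git add file1.cc; git commit -q -m A
echo v2 > file1.cc; git commit -q -am B

# Stand-in for Repo B: B' imports Repo A's final snapshot, then C and D follow.
git init -q "$work/repo-b"
cd "$work/repo-b"
git symbolic-ref HEAD refs/heads/master
cp "$work/repo-a/file1.cc" .; git add file1.cc; git commit -q -m "B' (import)"
echo v3 > file1.cc; git commit -q -am C
echo v4 > file1.cc; git commit -q -am D

git remote add repo-a "$work/repo-a"
git fetch -q repo-a

# Replay everything after the root commit B' on top of the old history.
root=$(git rev-list --max-parents=0 master)
git rebase -q --onto repo-a/master "$root" master

count=$(git rev-list --count master)
content=$(cat file1.cc)
echo "commits on master: $count, file1.cc: $content"
```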

Either of these options will rewrite commits C and D. (Even though re-parenting ensures the TREE is unchanged, the commits are still replaced.) Your developers would have to treat this as an upstream rebase (see the git rebase documentation under "recovering from upstream rebase"). To mitigate this, I generally recommend doing a coordinated cut-over where devs check in everything they have, discard their clones, then you do the rewrite and they re-clone from the new repo.

If you want to avoid the rewrite, you can use the third option: git replace. This is known to have a few quirks, and it requires each clone to be set up correctly in order to "see" the spliced history.

So to support this, you'd just tag B (and maybe also B'):

git tag old-history repo-a/master
git tag new-root B'

(where B' is the appropriate SHA ID, or an equivalent expression).

When someone clones the repo, they'll see only the new history, but they can say

git replace new-root old-history

and this will paper over the break in history.
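
To see what the replacement does and does not change, here is a sketch on stand-in repositories (contents invented). It counts the commits reachable from master with the replacement in effect and again with --no-replace-objects, which shows the untouched underlying history.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for Repo A: commits A and B on master.
git init -q "$work/repo-a"
cd "$work/repo-a"
git symbolic-ref HEAD refs/heads/master
echo v1 > file1.cc; git add file1.cc; git commit -q -m A
echo v2 > file1.cc; git commit -q -am B

# Stand-in for Repo B: B' imports Repo A's final snapshot, then C and D follow.
git init -q "$work/repo-b"
cd "$work/repo-b"
git symbolic-ref HEAD refs/heads/master
cp "$work/repo-a/file1.cc" .; git add file1.cc; git commit -q -m "B' (import)"
echo v3 > file1.cc; git commit -q -am C
echo v4 > file1.cc; git commit -q -am D

git remote add repo-a "$work/repo-a"
git fetch -q repo-a

# Tag both ends of the splice, then tell Git to read B wherever B' is asked for.
git tag old-history repo-a/master
git tag new-root "$(git rev-list --max-parents=0 master)"
git replace new-root old-history

with=$(git rev-list --count master)
without=$(git --no-replace-objects rev-list --count master)
echo "with replacement: $with, without: $without"
```

No commit IDs on master change, which is why nobody has to re-clone; the cost is that the replacement lives in refs/replace/ and is not picked up by a default clone.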

Once you've done your reparent, rebase, or replace, you can remove the repo-a remote.
