How can I combine Git repositories into a linear history?
Question
I have two git repositories, R1 and R2, which contain commits from two periods of a product's development: 1995-1997 and 1999-2013. (I created them by converting existing RCS and CVS repositories into Git.)
R1:
A---B---C---D
R2:
K---L---M---N
How can I combine the two repositories into a single one that contains an accurate view of the project's linear history?
A---B---C---D---K---L---M---N
Note that between R1 and R2 files have been added, deleted, and renamed.
I tried creating an empty repository and then merging their contents onto it.
git remote add R1 /vol/R1.git
git fetch R1
git remote add R2 /vol/R2.git
git fetch R2
git merge --strategy=recursive --strategy-option=theirs R1
git merge --strategy=recursive --strategy-option=theirs R2
However, this leaves behind files that were in revision D but not in revision K. I could craft a synthetic commit to remove the extra files between the merges, but this seems inelegant to me. Furthermore, with this approach the end result contains merges that didn't actually occur.
Recommended answer
Using git filter-branch
Using the trick straight from the git-filter-branch man page:
First, create a new repository with the two original ones as remotes, just as you did before. I am assuming that both use the branch name "master".
git init repo
cd repo
git remote add R1 /vol/R1.git
git fetch R1
git remote add R2 /vol/R2.git
git fetch R2
Next, point "master" (the current branch) to the tip of R2's "master".
git reset --hard R2/master
Now we can graft the history of R1's "master" to the beginning.
git filter-branch --parent-filter 'sed "s_^\$_-p R1/master_"' HEAD
In other words, we are inserting a fake parent link between D and K (D becomes the parent of K), so the new history looks like:
A---B---C---D---K---L---M---N
The only change to K through N is that K's parent pointer changes, and thus all of the SHA-1 identifiers change. The commit message, author, timestamp, etc., stay the same.
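To check that claim on a small scale, here is a minimal sketch that builds two throwaway repositories and grafts one onto the other. The temporary directory, the make_repo helper, and the file names are all made up for the demo, and the repositories are tiny stand-ins for A---B and K---L. It compares the tree of the rewritten tip with the tree of the original R2/master and counts commits and merges.

```shell
set -e
# Fixed identity so the demo does not depend on the local git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the advisory pause in newer Git
work=$(mktemp -d)
cd "$work"

# make_repo NAME FILE...: hypothetical helper, one commit per file on "master".
make_repo() {
    repo=$1; shift
    git init -q "$repo"
    git -C "$repo" symbolic-ref HEAD refs/heads/master
    for f in "$@"; do
        echo "$f" > "$repo/$f"
        git -C "$repo" add "$f"
        git -C "$repo" commit -q -m "add $f"
    done
}

make_repo R1 a.txt b.txt    # older history, stands in for A---B
make_repo R2 k.txt l.txt    # newer history, stands in for K---L

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git remote add R1 "$work/R1" && git fetch -q R1
git remote add R2 "$work/R2" && git fetch -q R2

git reset -q --hard R2/master
git filter-branch --parent-filter 'sed "s_^\$_-p R1/master_"' HEAD >/dev/null 2>&1

commit_count=$(git rev-list --count HEAD)            # commits reachable from the new tip
merge_count=$(git rev-list --merges --count HEAD)    # merge commits among them
new_tree=$(git rev-parse 'HEAD^{tree}')              # tree of the rewritten tip
old_tree=$(git rev-parse 'R2/master^{tree}')         # tree of the original tip
tip_files=$(git ls-tree --name-only HEAD | tr '\n' ' ')
echo "commits=$commit_count merges=$merge_count files=$tip_files"
```

If the graft behaves as described, the two tree IDs match, there are no merge commits, and the files that existed only in R1 do not show up at the new tip.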
If you have more than two repositories to do, say R1 (oldest) through R5 (newest), just repeat the git reset and git filter-branch commands in chronological order.
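# Start from the oldest history, then graft each newer one onto the
# combined tip built so far (not onto the untouched remote branch).
git reset --hard R1/master
for CHILD_REPO in R2 R3 R4 R5; do
    PARENT_TIP=$(git rev-parse HEAD)
    git reset --hard "$CHILD_REPO/master"
    # -f overwrites the refs/original backup left by the previous pass
    git filter-branch -f --parent-filter 'sed "s_^\$_-p '"$PARENT_TIP"'_"' HEAD
done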
Using grafts
As an alternative to using the --parent-filter option to filter-branch, you may instead use the grafts mechanism.
Consider the original situation of appending R2/master as a child of (that is, newer than) R1/master. As before, start by pointing the current branch (master) to the tip of R2/master.
git reset --hard R2/master
Now, instead of running the filter-branch command, create a "graft" (fake parent) in .git/info/grafts that links the "root" (oldest) commit of R2/master (K) to the tip (newest) commit in R1/master (D). (If there are multiple roots of R2/master, the following will only link one of them.)
ROOT_OF_R2=$(git rev-list R2/master | tail -n 1)
TIP_OF_R1=$(git rev-parse R1/master)
echo $ROOT_OF_R2 $TIP_OF_R1 >> .git/info/grafts
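Since the tail -n 1 approach picks only one root, it may be worth counting the roots before writing the graft. The sketch below uses git rev-list --max-parents=0, which lists commits that have no parents. The demo repository, its branch names, and its files are invented for illustration: it deliberately builds a history with two roots by merging an orphan branch.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/master

# First root commit on master.
echo one > one.txt
git add one.txt
git commit -q -m "first root"

# Second, unrelated root on an orphan branch (hypothetical name "other").
git checkout -q --orphan other
git rm -q -rf .
echo two > two.txt
git add two.txt
git commit -q -m "second root"

# Join the two histories so master has two roots.
git checkout -q master
git merge -q --allow-unrelated-histories -m "join both roots" other

roots=$(git rev-list --max-parents=0 master)              # every parentless commit
root_count=$(git rev-list --max-parents=0 --count master)
oldest_listed=$(git rev-list master | tail -n 1)          # what the one-liner would pick
echo "roots found: $root_count"
```

When the count is greater than one, each root that should hang off the older history needs its own line in the graft file.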
At this point, you can look at your history (say, through gitk) to see if it looks right. If so, you can make the changes permanent via:
git filter-branch
Finally, you can clean everything up by removing the graft file.
rm .git/info/grafts
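As far as I know, newer Git releases (2.18 onward) warn that the grafts file is deprecated in favor of replacement refs. The same preview-then-commit workflow can be sketched with git replace --graft, which the filter-branch man page says is honored in the same way. This is an alternative to the graft file, not what the steps above use. The throwaway repositories, the make_repo helper, and the file names are assumptions for the demo.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the advisory pause in newer Git
work=$(mktemp -d)
cd "$work"

# make_repo NAME FILE...: hypothetical helper, one commit per file on "master".
make_repo() {
    repo=$1; shift
    git init -q "$repo"
    git -C "$repo" symbolic-ref HEAD refs/heads/master
    for f in "$@"; do
        echo "$f" > "$repo/$f"
        git -C "$repo" add "$f"
        git -C "$repo" commit -q -m "add $f"
    done
}

make_repo R1 a.txt b.txt
make_repo R2 k.txt l.txt

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git remote add R1 "$work/R1" && git fetch -q R1
git remote add R2 "$work/R2" && git fetch -q R2
git reset -q --hard R2/master

ROOT_OF_R2=$(git rev-list --max-parents=0 R2/master)   # assumes a single root
TIP_OF_R1=$(git rev-parse R1/master)

# Pretend that the root of R2 has the tip of R1 as its parent.
git replace --graft "$ROOT_OF_R2" "$TIP_OF_R1"
preview_count=$(git rev-list --count HEAD)   # history as seen through the replacement

# Make it permanent, then drop the replacement ref.
git filter-branch HEAD >/dev/null 2>&1
git replace -d "$ROOT_OF_R2" >/dev/null
final_count=$(git --no-replace-objects rev-list --count HEAD)
echo "preview=$preview_count final=$final_count"
```

To abort before the filter-branch step, deleting the replacement with git replace -d plays the same role as removing the graft file.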
Using grafts is likely more work than using --parent-filter, but it does have the advantage of being able to graft together more than two histories with a single filter-branch. (You could do the same with --parent-filter, but the script would become very ugly very fast.) It also has the advantage of allowing you to see your changes before they become permanent; if it looks bad, just delete the graft file to abort.
To use the graft method with R1 (oldest) through R5 (newest), just add multiple lines to the graft file. (The order in which you run the echo commands does not matter.) Once the graft file is complete, run git filter-branch and remove the file as before.
git reset --hard R5/master
PARENT_REPO=R1
for CHILD_REPO in R2 R3 R4 R5; do
ROOT_OF_CHILD=$(git rev-list $CHILD_REPO/master | tail -n 1)
TIP_OF_PARENT=$(git rev-parse $PARENT_REPO/master)
echo "$ROOT_OF_CHILD" "$TIP_OF_PARENT" >> .git/info/grafts
PARENT_REPO=$CHILD_REPO
done
What about git rebase?
Several others have suggested using git rebase R1/master instead of the git filter-branch command above. This will take the diff between the empty tree and K and then try to apply it to D, resulting in:
A---B---C---D---K'---L'---M'---N'
This will most likely cause a merge conflict, and may even result in spurious files being created in K' if a file was deleted between D and K. The only case in which this will work is if the trees of D and K are identical.
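The spurious-file problem is easy to reproduce on a toy example. In the sketch below (the throwaway repositories, the make_repo helper, and the file names are invented for the demo), R1 ends with a.txt and b.txt, while R2 contains only k.txt, as if the older files had been deleted in between. Rebasing R2's history onto R1/master happens to apply cleanly here, but the old files survive in the rebased tip.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# make_repo NAME FILE...: hypothetical helper, one commit per file on "master".
make_repo() {
    repo=$1; shift
    git init -q "$repo"
    git -C "$repo" symbolic-ref HEAD refs/heads/master
    for f in "$@"; do
        echo "$f" > "$repo/$f"
        git -C "$repo" add "$f"
        git -C "$repo" commit -q -m "add $f"
    done
}

make_repo R1 a.txt b.txt    # older history, ends with a.txt and b.txt
make_repo R2 k.txt          # newer history, the old files no longer exist

git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git remote add R1 "$work/R1" && git fetch -q R1
git remote add R2 "$work/R2" && git fetch -q R2
git reset -q --hard R2/master

# Replay R2's commits on top of R1's tip instead of grafting.
git rebase -q R1/master >/dev/null 2>&1

rebased_files=$(git ls-tree --name-only HEAD | tr '\n' ' ')       # files in K'
wanted_files=$(git ls-tree --name-only R2/master | tr '\n' ' ')   # files in the real K
echo "rebased tip: $rebased_files"
echo "original K:  $wanted_files"
```

The rebased tip no longer matches the tree that was actually recorded in R2, which is exactly what the graft approaches avoid.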
(Another slight difference is that git rebase alters the committer information for K' through N', whereas git filter-branch does not.)