How do I prepend history to a Git repository?


Problem description


I have a project that has existed in two SVN repositories. The second SVN repository was created simply by adding the files from a checkout of the old SVN repository, without its SCM information. The contents of the files are byte-identical, but there is no associated SCM metadata.

I have taken the new SVN repository and ported it into a Git repository via git-svn. Now I would like to import the old repository and somehow link it to the new repository so I can see the history across both. Is there a simple way to do this without stitching the two repositories together by hand?

Solution

See also the question "How do I re-play my commits of a local Git repository, on top of a project I forked on github.com?" (and my answer there), although I think the situation there is slightly different.


You have at least three possibilities:

  • Use grafts to join two histories, but do not rewrite history. This means that you (and anybody who has the same grafts) would have full history, while other users would have a smaller repository. This also avoids problems with rewritten history if somebody already started working on top of the converted repository with a shorter history.

  • Use grafts to join two histories, check that the result is correct using "git log" or "gitk" (or another Git history browser/viewer), and then rewrite history using git filter-branch; after that you can remove the grafts file. This means that everybody who clones (fetches) from the rewritten repository gets the full, joined history. But rewriting history is a big no-no if somebody has already based work on the converted short-history repository (this case might not apply to you).

  • Use git replace to join two histories. This lets people choose between the full history and just the current history, by fetching refs/replace/ (then they get the full history) or not (then they get the short history). Unfortunately, at the time of writing this requires a not-yet-released version of Git: either the development ('master') version or one of the release candidates for 1.6.5. The refs/replace/ hierarchy is planned for the upcoming Git version 1.6.5.


Below are step-by-step instructions for all those methods: grafts (local), rewriting history using grafts, and refs/replace/.

In all cases I assume that you have both the current and the historical history in a single repository (you can bring in history from another repository by registering it with git remote add and then fetching from it with git fetch). I also assume that (one of) the branches in the short-history repository is named 'master', and that the branch (commit) of the historical repository where you want to attach the current history is called 'history'. You will have to substitute your own branch names (or commit IDs).
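As a rough sketch of that setup, the following builds two throwaway repositories and pulls the old history into the new one. The repository names old and new, the remote name historical, and the local paths are all made up for illustration; in a real conversion they would be your git-svn clones.

```shell
set -e
cd "$(mktemp -d)"
# Throwaway identity so the commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical stand-ins for the two converted repositories.
for repo in old new; do
    git init -q "$repo"
    git -C "$repo" symbolic-ref HEAD refs/heads/master
    git -C "$repo" commit -q --allow-empty -m "$repo: first commit"
    git -C "$repo" commit -q --allow-empty -m "$repo: second commit"
done

cd new
# 'historical' is an arbitrary remote name; the path would normally be a URL.
git remote add historical ../old
git fetch -q historical
# Give the tip of the old history the branch name used in the text.
git branch --no-track history historical/master

# Both histories are now in one repository, still unconnected.
git log --oneline --all
```

At this point 'master' and 'history' share no commits; the remaining steps are about connecting them.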

Finding the commit to attach (root of the short history)

First, you have to find the (SHA-1 identifier of the) commit in the short history that you want to attach to the full history. It will be the first commit in the short history, i.e. the root commit (the commit without any parents).

There are two ways of finding it. If you are sure that you do not have any other root commit, you can find the last (bottommost) commit in topological order, using:

$ git rev-list --topo-order master | tail -n 1

(where tail -n 1 is used to get the last line of the output; you don't need to use it if you don't have it.)

If there is a possibility of multiple root commits, you can find all parentless commits using the following one-liner:

$ git rev-list --parents master | grep -v ' '

(where grep -v ' ', that is, a space between single quotes, is used to filter out all commits which have any parents). If there is more than one such commit, you have to inspect them (using e.g. "git show <commit>") and select the one that you want to attach to the earlier history.

Let's call this commit TAIL. You can save it in a shell variable using (assuming that the simpler method works for you):
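For comparison, newer Git versions let git rev-list filter on the number of parents directly with --max-parents=0. The sketch below runs both forms against a throwaway repository (the repository name demo and the commit messages are made up) and prints the result of each:

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "root"
git commit -q --allow-empty -m "child"

# Method from the text: keep only the lines that list no parent.
ROOTS_GREP=$(git rev-list --parents master | grep -v ' ')
# The same question asked directly: commits with at most zero parents.
ROOTS_FLAG=$(git rev-list --max-parents=0 master)

echo "grep:          $ROOTS_GREP"
echo "--max-parents: $ROOTS_FLAG"
```

Both forms list every root commit reachable from 'master', so the inspection step still applies if more than one line comes back.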

$ TAIL=$(git rev-list --topo-order master | tail -n 1)

In the description below I will use $TAIL to mean that you have to substitute the SHA-1 of the bottommost commit in the current (short) history... or allow the shell to do the substitution for you.

Finding a commit to attach to (top of the historical repository)

This part is simple. We have to convert the symbolic name of the commit into a SHA-1 identifier. We can do this using "git rev-parse":

$ git rev-parse --verify history^0

(where 'history^0' is used in place of 'history' just in case 'history' is a tag; we need the SHA-1 of the commit, not of a tag object). As with the commit to attach, let's give this commit ID a name: TOP. You can save it in a shell variable using:

$ TOP=$(git rev-parse --verify history^0)
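To see why the ^0 suffix matters, here is a small sketch in a throwaway repository where 'history' is deliberately an annotated tag (the repository name and messages are made up):

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q demo && cd demo
git commit -q --allow-empty -m "last commit of the old history"
# An annotated tag is an object of its own, with its own SHA-1.
git tag -a -m "end of old history" history

TAG_OBJECT=$(git rev-parse --verify history)
TOP=$(git rev-parse --verify history^0)

git cat-file -t "$TAG_OBJECT"   # prints: tag
git cat-file -t "$TOP"          # prints: commit
```

Without ^0 the variable would hold the ID of the tag object, which is not what a parent line in a graft or commit header should contain.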

Joining history using a grafts file

The grafts file, located at .git/info/grafts (create it if it doesn't exist), is used to replace the parent information for a commit. It is a line-based format: each line contains the SHA-1 of the commit we want to modify, followed by a space-separated list of zero or more commits that we want it to have as parents. This is the same format that "git rev-list --parents <revision>" outputs.

We want the $TAIL commit, which doesn't have any parents, to have $TOP as its single parent. So the info/grafts file should contain a line with the SHA-1 of the $TAIL commit, followed by a space and the SHA-1 of the $TOP commit. You can use the following one-liner for this (see also the examples in the git filter-branch documentation):

$ echo "$TAIL $TOP" >> .git/info/grafts

Now you should check, using "git log", "git log --graph", "gitk" or another history browser, that you joined the histories correctly.
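The check can also be scripted. The sketch below is not the grafts-file method itself: recent Git versions flag info/grafts as deprecated and offer git replace --graft, which records the same parent override as a replace ref, so that command is used here instead. The throwaway repository and its commit messages are made up for illustration.

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# One repository holding two unrelated histories: 'history' and 'master'.
git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/history
git commit -q --allow-empty -m "old 1"
git commit -q --allow-empty -m "old 2"
git checkout -q --orphan master
git commit -q --allow-empty -m "new 1"
git commit -q --allow-empty -m "new 2"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

# Swapped-in technique: same effect as a "$TAIL $TOP" line in info/grafts.
git replace --graft "$TAIL" "$TOP"

# The root of the short history should now report $TOP as its parent ...
git rev-list --parents -n 1 "$TAIL"
# ... and walking back from master should reach the old commits too.
git log --oneline master
```

If the join worked, the first command prints the two IDs side by side and the log runs through both histories.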

Rewriting history according to the grafts file

Please note that this changes history!

To make the history recorded in the grafts file permanent, it is enough to use "git filter-branch" to rewrite the branches you need. If there is only a single branch that needs to be rewritten ('master'), it can be as simple as:

$ git filter-branch $TOP..master

(This processes only the minimal set of commits.) If more branches are affected by joining the history, you can simply use

$ git filter-branch -- --all

(the "--" separates filter-branch's own options from the revision options that follow it).

Now you can delete the grafts file. Check that everything is the way you wanted, and then remove the backup refs in refs/original/ (see the documentation for "git filter-branch" for details).
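Put together, the rewrite and the cleanup might look like the sketch below. It again uses a throwaway repository with made-up commits, and it creates the temporary join with git replace --graft rather than a grafts file (filter-branch makes either kind of join permanent). The FILTER_BRANCH_SQUELCH_WARNING variable only suppresses the warning pause that newer Git adds; older versions ignore it.

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/history
git commit -q --allow-empty -m "old 1"
git commit -q --allow-empty -m "old 2"
git checkout -q --orphan master
git commit -q --allow-empty -m "new 1"
git commit -q --allow-empty -m "new 2"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

git replace --graft "$TAIL" "$TOP"   # temporary join (stands in for info/grafts)
git filter-branch "$TOP"..master     # bake the join into real commits
git replace -d "$TAIL"               # the temporary join is no longer needed

# Drop the backup refs that filter-branch left under refs/original/.
git for-each-ref --format='%(refname)' refs/original/ |
    xargs -n 1 git update-ref -d

git log --oneline master
```

After this, 'master' contains the old commits as ordinary ancestors, with no graft or replace ref involved.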

Using the refs/replace/ mechanism

This is an alternative to the grafts file. It has the advantage that it is transferable, so if you have published the short history and cannot rewrite it (because others have based their work on the short history), then using refs/replace/ might be a good solution... well, at least once Git version 1.6.5 gets released.

The refs/replace/ mechanism operates differently from a grafts file: instead of modifying the parent information of a commit, you replace whole objects. So first you have to create a commit object which has the same properties as $TAIL, but has $TOP as a parent.

We can dump the $TAIL commit object into a file using

$ git cat-file commit $TAIL > TAIL_COMMIT

(The name of the temporary file is only an example.)

Now you need to edit the 'TAIL_COMMIT' file, which will look something like this:

tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
author Joe R Hacker <joe@example.com> 1112911993 -0700
committer Joe R Hacker <joe@example.com> 1112911993 -0700

Initial revision of "project", after moving to new repository

Now you need to add $TOP as a parent, by putting a line with "parent $TOP" (where $TOP has to be expanded to the SHA-1 id!) between the 'tree' header and the 'author' header. After editing, 'TAIL_COMMIT' should look like this:

tree 2b5bfdf7798569e0b59b16eb9602d5fa572d6038
parent 0f6592e3c2f2fe01f7b717618e570ad8dff0bbb1
author Joe R Hacker <joe@example.com> 1112911993 -0700
committer Joe R Hacker <joe@example.com> 1112911993 -0700

Initial revision of "project", after moving to new repository

If you want, you can edit the commit message.

Now you need to use git hash-object to create a new commit in the repository. You need to save the result of this command, which is the SHA-1 of a new commit object, for example like this:

$ NEW_TAIL=$(git hash-object -t commit -w TAIL_COMMIT)

(The '-w' option is there to actually write the object to the repository.)
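If you would rather not edit the file by hand, the dump, edit and hash steps can be scripted. This is only a sketch: it assumes the commit object starts with its 'tree' line (which is how git cat-file prints it) and uses awk to insert the parent header right after that line. The throwaway repository and its commits are made up.

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q demo && cd demo
git symbolic-ref HEAD refs/heads/history
git commit -q --allow-empty -m "old 1"
git checkout -q --orphan master
git commit -q --allow-empty -m "new 1"

TAIL=$(git rev-list --topo-order master | tail -n 1)
TOP=$(git rev-parse --verify history^0)

# Copy the commit, adding "parent <TOP>" after the first (tree) line.
git cat-file commit "$TAIL" |
    awk -v top="$TOP" 'NR == 1 { print; print "parent " top; next } { print }' \
    > TAIL_COMMIT

NEW_TAIL=$(git hash-object -t commit -w TAIL_COMMIT)
rm TAIL_COMMIT

# Show the new object: same tree, author and message, plus the parent line.
git cat-file commit "$NEW_TAIL"
```

The new object is not referenced by anything yet; it only takes effect once it is registered as a replacement for $TAIL.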

Finally use git replace to replace $TAIL by $NEW_TAIL:

$ git replace $TAIL $NEW_TAIL

What is left is to check (using "git log" or some other history viewer) that the history is correct.

Now anybody who wants to have the full history needs to add '+refs/replace/*:refs/replace/*' as one of their fetch (pull) refspecs.
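One way to add that refspec is with git config --add on the remote's fetch setting. The sketch below sets up a throwaway publisher repository with a replace ref, clones it, and compares the visible history before and after opting in. The names publisher and consumer are made up, and git replace --graft is used only as a shortcut to create the replace ref.

```shell
set -e
cd "$(mktemp -d)"
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Publisher: short history on master, old history reachable only
# through the replacement commit.
git init -q publisher
(
    cd publisher
    git symbolic-ref HEAD refs/heads/history
    git commit -q --allow-empty -m "old 1"
    git checkout -q --orphan master
    git commit -q --allow-empty -m "new 1"
    git replace --graft "$(git rev-parse master)" "$(git rev-parse history)"
    git branch -q -D history
)

# Consumer: a plain clone does not fetch refs/replace/.
git clone -q publisher consumer
BEFORE=$(git -C consumer rev-list master | wc -l)

# Opt in to the full history by adding the extra fetch refspec.
git -C consumer config --add remote.origin.fetch '+refs/replace/*:refs/replace/*'
git -C consumer fetch -q origin
AFTER=$(git -C consumer rev-list master | wc -l)

echo "commits visible before: $BEFORE, after: $AFTER"
```

Anyone who skips the extra refspec keeps seeing the short history, which is the opt-in behaviour described above.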

Final note: I have not checked this solution, so your mileage may vary.
