git-subtree is not retaining history so I cannot push subtree changes, how can I fix this/avoid this issue in the future?

Problem description

I've been using the git-subtree extension (https://github.com/apenwarr/git-subtree) to manage sub-projects within our main project. It's doing exactly what I want other than the fact that it fails when I try to split out changes made to a sub-project from our main project.

e.g. earlier on I had done

git subtree add -P Some/Sub/Dir --squash git@gitserver:lib.git master

to bring the library code into Some/Sub/Dir in our main project. Everything here went great, so I pushed my changes to our central main project bare git repo. I then decided to make a change to my local version of the lib in Some/Sub/Dir, commit it, then split it out to push it back to the lib.git repo:

git subtree split -P Some/Sub/Dir -b some_branch

Everything worked as expected. No longer needing the local copy of the repo, I deleted it.

After cloning a new copy of the repo from our central repo, I made some changes to the lib in Some/Sub/Dir and decided I wanted to split those changes out and push them back to the lib.git repository. I attempted to use the same subtree split command as before, but this time I ended up with the following output:

1/      3 (0)
2/      3 (1)
3/      3 (1)
fatal: bad object d76a03f0ec7e20724bcfa253e6a03683211a7bb1

d76a03f0ec7e20724bcfa253e6a03683211a7bb1 comes from when I added the subtree:

commit 43b3eb7d69d5eb64241eddb12e5bd74fd0215083
Author: Ian Bond <ibond@onezero.com>
Date:   Fri Apr 22 15:06:50 2011 -0400

    Squashed 'Subtree/librepoLib/' content from commit d76a03f

    git-subtree-dir: Subtree/librepoLib
    git-subtree-split: d76a03f0ec7e20724bcfa253e6a03683211a7bb1

which actually refers to a commit in the lib.git repo.


What I've been able to piece together (and I'm a git noob, so I may be wrong, overlooking something, or using incorrect terminology here) is that 'git subtree add --squash' brings the entire history from the remote lib.git repo into the current repo, squashes it down into a separate commit, then adds that commit to the working branch. The lib.git commits remain in the current repo, but they are dangling, since nothing references them other than the text of the squash commit. As long as those dangling commits remain, git-subtree can use them to perform splits. However, a push or pull doesn't transfer dangling objects (and a gc that fully prunes dangling objects deletes them), so in a fresh clone those commits are lost and git-subtree no longer has the information it needs to perform the split.
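One way to confirm this diagnosis is to ask git directly whether the commit named in the git-subtree-split: line is present in the clone. The following is a minimal sketch, not part of git-subtree itself; the helper name has_commit is made up for illustration, and the hash is the one from the error message above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given commit object exists in the
# repository that contains the current directory.
has_commit() {
    # `git cat-file -e` exits 0 when the object exists, non-zero otherwise.
    git cat-file -e "$1^{commit}" 2>/dev/null
}

# Hash recorded in the "git-subtree-split:" line of the squash commit.
split_sha=d76a03f0ec7e20724bcfa253e6a03683211a7bb1

if has_commit "$split_sha"; then
    echo "present: git subtree split has the history it needs"
else
    echo "missing: the library history is not in this repository"
fi
```

Run from inside the fresh clone, the "missing" branch would correspond to the "fatal: bad object" failure shown earlier.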

I've added a script that will fully reproduce the issues I've been having.


My questions are:

1) What can I do to handle the existing situation, where I now have subtrees that I want to merge back to their origin repo but no longer have any history that links them together? My current thought is to do something like:

git subtree split -P Some/Sub/Dir 43b3eb7^.. --ignore-joins -b splitBranch

to split out all of the history since the 'git subtree add' and merge it back into the origin repo (which thankfully has not had any changes since the add). Is this the best way to go? Any recommendations for how I should perform the merge?

2) Is there anything I can do to make git-subtree work as expected? I believe if I omit the --squash parameter on 'git subtree add' then everything will work, however that causes a bunch of unrelated history to be injected into my repo. Is there some way to keep the needed commits around (preferably without keeping the entire history of the library around)?

Solution

The purpose of git subtree split is to create some new commits (representing "local" changes originally made in the subtree’s local directory) on top of the subtree’s original history. Since it directly involves the subtree’s original history (as the parent commit of the first rewritten local commit that touches the subtree), the split operation cannot be done without the subtree’s original history itself being present.

Think about what you will be doing with the history that git subtree split generates. You will probably want to push it to a repository where you can merge it into the rest of the "upstream" history. In order for this merge operation to make sense, the split history needs to be based on the original history itself [1].

Probably the most reliable way to arrange for users to have the subtree’s original history is to publish the URL for the subtree’s upstream repository in your documentation and have them define a remote for it (it is perfectly fine to have "unrelated" remotes in a single repository). E.g.

If you need to work with the "upstream" of Some/Sub/Dir (to pull in external changes or push out local changes), please define and update a remote for the library’s repository before using git subtree:

git remote add lib git@host:the-lib-repository &&
git fetch lib

You would need to do something like this even if you were not using --squash since users would need to know where to get new upstream commits (and where (ultimately) to push new split-generated commits).

Using --squash gives you a "clean" history in your main project and means that only those users that need to deal with the subtree’s "upstream" actually have to have its objects in their repositories.
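Put together, the workflow in a fresh clone might look like the sketch below. It assumes the remote named lib from the snippet above already exists; the function name push_subtree_changes and the idea of pushing the split branch under the same name upstream are assumptions for illustration, not something git-subtree prescribes.

```shell
#!/bin/sh
# Hypothetical helper: fetch the library history, split the subtree's local
# changes onto a new branch, and push that branch to the library remote.
# Usage: push_subtree_changes <prefix> <branch>
push_subtree_changes() {
    prefix=$1   # subtree directory, e.g. Some/Sub/Dir
    branch=$2   # new local branch for the split result (should not exist yet)

    # Make sure the subtree's original history is present before splitting.
    git fetch lib &&
    git subtree split -P "$prefix" -b "$branch" &&
    git push lib "$branch"
}
```

For example, push_subtree_changes Some/Sub/Dir some_branch would leave some_branch in the library repository, ready to be merged there.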


It seems like you have a good understanding of the object model. You are correct that the history that git subtree add --squash pulls in will become dangling [2], but that git subtree split can still use it until it is pruned away.

(with reference to your reproduction script)
You are able to successfully split in your repoMainClone only because local clones automatically hardlink (or copy) all the files in .git/objects/, and so get access to repoMain’s copies of the dangling (or nearly dangling [2]) objects from repoLib. The usual "pack protocol" transport would instead limit the transferred objects to only those needed for the transferred refs, i.e. it would omit anything from repoLib. Your repoMainPull is effectively equivalent to cloning file://"$(pwd)"/repoMain into repoMainCloneFile (the file:// URL forces local clones to use pack-based transfers instead of just linking/copying everything).
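The difference between the two kinds of clone can be seen without git-subtree at all. The sketch below is a stand-in for the reproduction script, not a copy of it: it creates an unreferenced commit with git commit-tree to play the role of the library history, and the directory names cloneLocal and cloneFile are invented for the demonstration.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"

# A repository with one ordinary commit on its branch.
git init -q repoMain
cd repoMain
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "main history"

# A commit object that no ref points to, standing in for the library
# history left behind by `git subtree add --squash`.
orphan=$(git -c user.name=demo -c user.email=demo@example.com \
    commit-tree -m "unreferenced" "$(git rev-parse 'HEAD^{tree}')")
cd ..

# Path clone: links or copies everything in .git/objects/.
git clone -q repoMain cloneLocal
# file:// clone: uses the pack transport, sending only reachable objects.
git clone -q "file://$work/repoMain" cloneFile

for c in cloneLocal cloneFile; do
    if git -C "$c" cat-file -e "$orphan" 2>/dev/null; then
        echo "$c: unreferenced commit present"
    else
        echo "$c: unreferenced commit missing"
    fi
done
```

A clone over ssh or http from a central repository behaves like the file:// case here, which is why the split fails in a fresh clone.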


[1] Actually, you can directly merge unrelated histories, but you lose the ability to do three-way merges (since there is no common ancestor). This would be quite a sacrifice.

Your proposed git subtree split -P Some/Sub/Dir 43b3eb7^.. --ignore-joins … (where 43b3eb7 is the synthetic commit that resulted from git subtree add --squash …) would generate an unrelated history (except that the range needs to be 43b3eb7.., since 43b3eb7^ means "the first parent of 43b3eb7" and 43b3eb7 has no parents). I am not sure that git subtree split was designed to take ranges like this, though. The documentation for git subtree split just says <commit> and never really mentions its purpose. Reading the code shows that it defaults to HEAD, which might indicate that it is intended to be a single commit specifying the "tip" of the history that should be processed for splitting. Also, turning on the debug output shows the message "incorrect order:", which might indicate that a range argument puts the split operation in an unexpected situation: it expects to have processed all of the parents of a commit before processing the commit itself, but the range ensures that 43b3eb7 (which is the parent of the subtree merge commit) is never processed. I think you can just use --ignore-joins and leave off the range if you want to generate "unrelated" history and try to use it in some way: git subtree split -P Some/Sub/Dir --ignore-joins ….
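Before merging a branch produced this way, it may be worth checking whether it shares any ancestor with the upstream history, since that determines whether a three-way merge is possible. A small sketch follows; the helper name has_merge_base is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the two commits share a common ancestor,
# i.e. if git can find a base for a three-way merge between them.
# Usage: has_merge_base <commit-a> <commit-b>
has_merge_base() {
    # `git merge-base` exits non-zero when no common ancestor exists.
    git merge-base "$1" "$2" >/dev/null 2>&1
}
```

In a clone of the library repository, has_merge_base master splitBranch failing would indicate that splitBranch is an unrelated history of the kind described above.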

[2] They are not actually dangling immediately after git subtree add --squash because they are still referenced by FETCH_HEAD. Once an unrelated fetch is done, however, they will become truly dangling.
