GIT: How do I rebase nested branches?

Problem description

My structure looks like this:

master
  develop 
    project
      <sprint_number>
        <task_number>

I work on the task_number branch. Then I merge task with the sprint branch. Then I merge sprint with the project branch. In this way, all of the commits on project are sprints, and all of the commits on sprint are tasks. After merging into project branch, I submit a merge request and a code review is performed before merging into develop.

Should I do a rebase all the way down the chain? For example:

git checkout develop
git rebase master
git checkout project
git rebase develop
git checkout <sprint_number>
git rebase project
git checkout <task_number>
git rebase <sprint_number>

Solution

Git branch names don't actually nest in any sense: they're just pointers to specific commits.

First, draw (part of) the commit DAG

What we need to do here, as usual, is draw some commit Directed Acyclic Graph (DAG) fragments and consider cases where rebasing makes sense. So we start with your example:

master
  develop 
    project
      <sprint_number>
        <task_number>

and add some nodes (and give them single-uppercase-letters instead of their "true name" hashes like a1cf93a... since those are too big and unwieldy):

A <- B <- C                <-- master
      \
       D <- E              <-- develop
             \
              F <- G       <-- project
                    \
                     H     <-- <sprint_number>
                      \
                       I   <-- <task_number>

(the backslashes here should be up-and-left arrows but those are too hard to draw in plain text).

That is, in this case we have (at least) three commits on master (there may be any number of commits before commit A that we simply did not draw in). The tip of master is commit C, which points back to commit B, which points back to A.

We have two commits on develop that are not also on master: commit E is the tip of develop and E points back to D, while D points back to B. Commit B, along with all of its ancestors (A and anything earlier), is on both master and develop.

Meanwhile commit G is the tip of project; G points back to F which points back to E, and so on. This means commits A and B are, in fact, on all three branches. But wait, there's more! H is the tip of <sprint_number> and H points back to G, and so on; and I is the tip of <task_number> and I points back to H.

In the end, this means that commits A and B are on (at least) five branches (the five shown here), and D and E are on at least four branches, and so on.
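If you'd rather check the "which commits are on which branches" claims than take my drawing on faith, you can rebuild the graph in a scratch repository. This is only a minimal sketch: the `commit` helper, the one-file-per-commit layout, and the branch names `sprint-1` and `task-1` (standing in for the sprint and task branches) are assumptions I made up for illustration.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example" && git config user.email "example@example.com"

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B; B=$(git rev-parse HEAD)
git checkout -q -b develop;  commit D; D=$(git rev-parse HEAD); commit E
git checkout -q -b project;  commit F; commit G
git checkout -q -b sprint-1; commit H
git checkout -q -b task-1;   commit I
git checkout -q master;      commit C

git log --graph --oneline --all   # the DAG, as git draws it
git branch --contains "$B"        # lists all five branches
git branch --contains "$D"        # lists every branch except master
```

`git branch --contains` answers the "is this commit on that branch" question directly from reachability, which is all that "on a branch" means in git.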

Decide if rebasing is needed and allowed

In git, rebasing actually means copying commits to new, slightly different/modified commits. (This may not be the right approach. We'll get to that later, though, because it won't make sense until you know more.)

The tip of master is now commit C rather than commit B. Presumably, earlier, the tip of master was B, and that was when we made commit D (and maybe E as well). But now you're considering rebasing develop onto the new tip of master.

To achieve this you must copy commits D and E to new, different commits. We'll call these copies D' and E'. Even if nothing else changes—and it's likely that something else does change, specifically whatever is different between B and C will go into the new D'—the copy D' of original commit D has to point to commit C rather than to commit B.

Drawing just this copy phase (leaving out everything hung off the original E) we get:

A - B - C             <-- master
     \    \
      \     D' - E'   <-- develop (after rebase)
       \
        D - E         [abandoned]

(I've simplified the left pointing arrows this time too, now that we know that commits point leftward.) But while the original D and E are no longer pointed-to by branch name develop, they're still reachable once we fill in the rest of the drawing:

A - B - C             <-- master
     \    \
      \     D' - E'   <-- develop (after rebase)
       \
        D-E
           \
            F-G       <-- project
               \
                H     <-- <sprint_number>
                 \
                  I   <-- <task_number>

What's particularly significant at this point is that original commits D and E are no longer on develop.

How rebase works

Ignoring --fork-point (which can be a solution here), the git rebase command really takes three arguments, one of which is normally just taken from HEAD:

  • the tip-most commit to copy (this is normally just "your current branch", i.e., HEAD);
  • a specifier that limits which commits to copy, i.e., specifies—but indirectly—commits not to copy; and
  • the identity of the commit to which the first copied commit will be added.

The latter two are usually combined into one <upstream> argument. Meanwhile you first do a git checkout of the branch to rebase, to set the first argument. For instance, if we were to decide to rebase develop onto master:

git checkout develop
git rebase master

Here the tip-most commit to copy is the HEAD commit as usual, which because of the git checkout is the tip-most commit of develop, and the starting place at which the new copies will be grown is the tip of master. Git starts by considering copying every commit that is on develop (which would be A, B, D, and E), but it's told here to avoid copying every commit that is on master, which means A, B, and C.

(Wait, what? We're not supposed to copy C? But we weren't going to copy C in the first place! Well, no problem then, we just won't copy it!) That's how we can combine the two things into one <upstream> argument. We want to add the new copies after C, and at the same time, avoid copying C and everything in the path leading back from C.

So if we choose to go ahead and do this git rebase, we'll copy D and E to D' and E' and end up with the new graph fragment we drew.
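Here is a small sketch of that first rebase in a throwaway repository, so you can see which commits get selected and where the originals end up. The `commit` helper and the one-file-per-commit layout are my own assumptions, not anything from your repository.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop; commit D; commit E; E=$(git rev-parse HEAD)
git checkout -q -b project; commit F; commit G
git checkout -q master;     commit C

# Commits reachable from develop but not from master: the ones rebase will copy.
git log --format=%s master..develop      # E, then D

# Same effect as "git checkout develop" followed by "git rebase master".
git rebase -q master develop

git log --format=%s develop              # E D C B A, where E and D are now the copies
# The original E is gone from develop but still reachable from project.
git merge-base --is-ancestor "$E" project && echo "original E is still on project"
```

The `master..develop` range is the "limit which commits to copy" half of the `<upstream>` argument; the tip of master is the "where to add the copies" half.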

That's great for develop, but what happens now if we do:

git checkout project
git rebase develop

This time, we'll ask git to copy everything reachable from the tip of project—these are G, F, E, D, B, and A (and maybe something more)—to the tip of the already-rebased develop, i.e., commit E'.

This is a problem. It may be a self-solving one, if we're lucky, because rebase will detect some cases of already-copied commits and avoid re-copying them. That is, when git goes to copy D to a(nother) new copy D'', it may detect that the change D makes is already present in the history leading to E' (as D'). If it does detect this it will just skip the copy. The same happens when it goes to copy E to E'': it may detect that this is not needed, and skip the copy.
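You can ask git ahead of time which commits it would recognize as already copied. The sketch below uses `git cherry`, which compares the changes themselves rather than the commit IDs; the scratch repository and the `commit` helper are assumptions for illustration. In this toy repository the copies are trivially identical to the originals, so this is the lucky case.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop; commit D; commit E
git checkout -q -b project; commit F; commit G
git checkout -q master;     commit C

git rebase -q master develop     # develop is now D' - E' on top of C

# "-" marks a commit whose change already exists on develop; "+" marks one that does not.
out=$(git cherry -v develop project)
printf '%s\n' "$out"

# Rebase skips the "-" commits here, so only F and G are actually copied.
git rebase -q develop project
git log --format=%s project
```

If a conflict resolution during the first rebase had altered D' or E', the originals would show up as `+` and git would try to copy them again, which is the unlucky case.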

On the other hand, git's detector may be fooled, and it might copy D and/or E. We definitely don't want that, so it's best to avoid asking git to copy them at all.

There are a number of ways to do that, including an interactive rebase (where we get to edit the pick instructions, so we can delete the two pick lines for commits D and E), or being more clever with arguments to git rebase:

git checkout project
git rebase --onto develop 'develop@{1}'

This second command uses the reflog history to tell git that the commits to copy are those that are on project (the current branch) that are not contained within the previous tip of develop. That is, 'develop@{1}' resolves to the commit ID of original (un-copied) commit E, provided the rebase was the most recent update to develop. This will therefore copy just commits F and G, to F' and G'.
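Here is the reflog-based variant as a runnable sketch. It names the previous tip of develop, `develop@{1}`, as the cut-off, because original commit E is where develop pointed just before it was rebased. The scratch repository and the `commit` helper are assumptions for illustration.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop; commit D; commit E; E=$(git rev-parse HEAD)
git checkout -q -b project; commit F; commit G
git checkout -q master;     commit C

git rebase -q master develop

# The rebase wrote one new entry to develop's reflog, so @{1} is the old tip.
git rev-parse 'develop@{1}'
echo "$E"                        # same ID as the line above

# Copy only what lies between the old develop tip and project, onto the new develop.
git rebase -q --onto develop 'develop@{1}' project
git log --format=%s project
```

This only works while the rebase is still the most recent update to develop; after another commit or rebase there, you would need `develop@{2}` or the raw commit ID instead.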

(Incidentally, if you draw your DAGs on a whiteboard with colored markers, you can use colors to represent the original commits and their copies. I find this easier to read than all the D' and D'' notation. I just can't draw it on StackOverflow.)

We can repeat this process with the sprint and task, using the reflog to identify commits to leave out.

Since git 1.9, git rebase now has --fork-point, which essentially automates what we're doing here with the reflogs. (There was a bug fix in git 2.1 for git rebase --fork-point failing to discover commits that don't need to be copied, so it would be wise to limit using this option to 2.1-or-later.) That could therefore be a way to do this.
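As a sketch of the --fork-point route: `git merge-base --fork-point` shows the commit that git works out from develop's reflog, and `git rebase --fork-point` uses the same computation. The scratch repository and the `commit` helper are assumptions for illustration.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop; commit D; commit E; E=$(git rev-parse HEAD)
git checkout -q -b project; commit F; commit G
git checkout -q master;     commit C

git rebase -q master develop

# Ask git where project forked from develop, consulting develop's reflog.
fp=$(git merge-base --fork-point develop project)
git log -1 --format='fork point: %h %s' "$fp"

# Let rebase do the same lookup and copy only the commits after the fork point.
git rebase -q --fork-point develop project
git log --format=%s project
```

When an upstream is named on the command line, --fork-point is off by default, which is why it is spelled out here.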

Finally, before returning to the question of whether this is a good idea at all, I'll make one more note. Instead of rebasing develop on master, and project on develop, and so on, suppose we started by rebasing the task. This would tell git to copy commit D to D', E to E', F to F', and so on all the way down to copying I to I'. The task branch would then point to new commit I', whose history chain reaches back to C. Now all we need to do here is re-point the sprint branch, the project branch, and the develop branch at the copied commits, by finding the right copy. The updated develop should point to E'; the updated project should point to G'; and the updated sprint branch should point to H'.
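A sketch of that "rebase the bottom branch, then re-point the rest" trick follows. The branch names `sprint-1` and `task-1` and the `commit` helper are assumptions, and the `~N` offsets are only right for this exact graph; count them from your own drawing.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example" && git config user.email "example@example.com"

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

commit A; commit B
git checkout -q -b develop;  commit D; commit E
git checkout -q -b project;  commit F; commit G
git checkout -q -b sprint-1; commit H
git checkout -q -b task-1;   commit I
git checkout -q master;      commit C

# One rebase copies D through I on top of C and leaves task-1 checked out.
git rebase -q master task-1

# Re-point the other labels at the matching copies, counting back from I'.
git branch -f sprint-1 task-1~1   # H'
git branch -f project  task-1~2   # G'
git branch -f develop  task-1~4   # E'

git log --graph --oneline --all
```

Newer git versions (2.38 and later) also have `git rebase --update-refs`, which moves branch labels that point into the rebased range for you; I have not shown it here since it postdates the versions discussed above.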

If there are additional sprint and/or task branches, they probably need to have some commit(s) copied that would not be copied by the above, though, so this trick has to be used carefully. As always, it will help to draw the DAG first.

Is rebasing right?

If you have a branch structure this complex, rebasing may be the wrong approach. Even if not, it may still be the wrong way to do this.

Remember that, as we just saw, rebasing involves copying commits, and then moving branch labels to point to the new copies, instead of the originals. When you do this with a repository that only you use, it's usually not too terribly confusing, because you move all your branch labels and you are now done: you either have the old, pre-copy state, or the new, post-copy state, and you can ignore all the intermediate (mid-rebase) state except for the brief period of doing all these rebases.

If someone else is sharing this repository, though, consider what you will do to them. Before you did all this massive rebasing, they had what they thought were the right develop, project, sprint, and task branch pointers. They were using the original (not yet copied) commits and making their own new commits that depend on those original commits.

Now you come along and tell them: "Oh, hey, forget all those old commits! Use these brand-new shiny ones instead!" Now they have to go find everything they did that depended on the old commits, and update all of those to depend instead on the new ones.

In other words, they must deal with an "upstream rebase", or in fact, with numerous upstream rebases. It's generally not a lot of fun (though the same --fork-point code that makes it possible for you to automate this also makes it possible for them to automate their recovery from the upstream rebases).
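To make the collaborator's side concrete, here is a sketch of one recovery from a single upstream rebase. Everything in it is assumed for illustration: the two scratch repositories, the `feature` branch, commit X, and the choice to record the old tip by hand in `old` rather than rely on --fork-point.

```shell
#!/bin/sh
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical helper: "commit X" adds X.txt in a commit whose subject is X.
commit() { echo "$1" > "$1.txt" && git add "$1.txt" && git commit -q -m "$1"; }

top=$(mktemp -d)
git init -q "$top/upstream" && cd "$top/upstream"
git symbolic-ref HEAD refs/heads/master
commit A; commit B
git checkout -q -b develop; commit D; commit E
git checkout -q master;     commit C

# The collaborator clones and builds X on top of the original E.
git clone -q "$top/upstream" "$top/work"
cd "$top/work"
git checkout -q -b feature origin/develop
commit X

# Meanwhile, develop gets rebased in the shared repository.
(cd "$top/upstream" && git rebase -q master develop)

# Recovery: remember the old tip, fetch the rewritten branch, move only X across.
old=$(git rev-parse origin/develop)
git fetch -q origin
git rebase -q --onto origin/develop "$old" feature
git log --format=%s feature
```

With several branches rebased upstream, each dependent local branch needs the same treatment, which is where the "not a lot of fun" part comes in.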

There is a time limit on --fork-point, because it uses reflog entries, and reflog entries expire. If you have not reconfigured things, git defaults to expiring the critical reflog entries after 30 days, so if you do this, everyone else has about a month to recover from it.
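If a month is too short, the expiry can be lengthened per repository. The sketch below sets `gc.reflogExpireUnreachable`, the setting that governs reflog entries for commits no longer reachable from the branch tip (which is what an abandoned pre-rebase tip becomes). The scratch repository and the 90-day value are arbitrary assumptions.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q .

# Show the current setting, if any; when unset, git falls back to 30 days.
git config --get gc.reflogExpireUnreachable || echo "not set, using git's default"

# Keep reflog entries for abandoned commits for 90 days in this repository.
git config gc.reflogExpireUnreachable "90 days"
git config --get gc.reflogExpireUnreachable
```

Each collaborator has to set this in their own clone, since reflogs and their expiry are local to each repository.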
