Git post-receive deployment stops working at random points

Problem description

I have a post-receive hook set up for Git which checks out to dev/staging/production based on the branch. For some reason, dev and staging have worked without issue, but production keeps breaking. After pushing the master branch, the updates fail to be checked out to the correct location, despite working right after the hook was initially set up.

#!/bin/bash
while read oldrev newrev refname
do
    branch=$(git rev-parse --symbolic --abbrev-ref $refname)
    if [ "master" == "$branch" ]; then
        GIT_WORK_TREE=/var/www/production git checkout -f $branch
    elif [ "staging" == "$branch" ]; then
        GIT_WORK_TREE=/var/www/staging git checkout -f $branch
    else
        GIT_WORK_TREE=/var/www/dev git checkout -f $branch
    fi
done

I have tried changing the master branch to a branch called production and have the same issue. Works initially and stops after a period of time for reasons I can't work out.

The if statement is working because when adding a touch command below the checkout statement, a file is created successfully in the correct directory. Which also rules out permissions, as all 3 directories are the same in that respect.

If anyone has any ideas, or can see something that could be causing this behaviour, then that would be great!

Solution

This same bug is present in billions and billions[1] (well, many) deployment scripts.

The problem is that Git has an index.

More precisely, Git needs an index per work-tree.[2]

A bare repository has no work-tree, but Git still has an index—as in, one (1) index, found in the file index in that bare repository. This means you can force the existence of one (1) work-tree using GIT_WORK_TREE or equivalent, and check out one branch into that one work-tree using that one index.

Your deployment script, like so many others, uses that one index to check out three different branches to three different work-trees. Things go wrong when Git believes the index and uses that to construct a minimal change to the assumed-to-be-one-single-work-tree you're checking each branch out into. You write the production branch to the work-tree at /var/www/production; then you update the work-tree, using the state saved in the (single) index, which describes correctly what's in the (single) work-tree, to update a different work-tree in /var/www/staging from the staging branch, so Git changes only the necessary files, using its saved knowledge and believing that that is what's in /var/www/staging ... well, you get the idea. :-)

The cure is to do one of these things:

  • Use three different work-trees with three different index files. Then the index file will in fact match the work-tree and Git's "make a minimal change" will work out. The new built-in git worktree add should be a good way to do this, though I have not experimented with this. Logically, setting the updateInstead mode of receive.denyCurrentBranch should update the appropriate work-tree. This requires a modern-ish Git: git worktree went into 2.5, had some important fixes in 2.6, and has had more, albeit smaller, fixes since then. (Note added Dec 2016: it doesn't actually work, even in Git version 2.11. It may eventually be made an option.)

  • Or, you can set the variable GIT_INDEX_FILE at the same time you set GIT_WORK_TREE, and just have three separate index files. Git will create them as needed, so this is the smallest change you can make to your existing deployment script:

    GIT_WORK_TREE=/var/www/production GIT_INDEX_FILE=$GIT_DIR/index.production \
        git checkout $branch
    

  • Or, make sure Git rebuilds the index and/or work-tree. If you remove the entire work-tree (or point Git at an empty work-tree), Git notices that the current index is worthless. It then checks everything out afresh.
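
As a rough sketch of the second option applied to the original hook: the helper names index_for_tree and checkout_into are made up for illustration, and the index path is made absolute as a precaution, on the assumption that GIT_DIR can be a relative path such as "." when a hook runs in a bare repository.

```shell
#!/bin/bash
# Sketch only: one index file per work-tree, so that each index
# describes exactly one directory. Helper names are hypothetical.

# Print the index file to use for a given work-tree, for example
# /var/www/staging -> <absolute git dir>/index.staging
index_for_tree() {
    local gitdir
    gitdir=$(cd "${GIT_DIR:-.}" && pwd) || return 1
    printf '%s/index.%s\n' "$gitdir" "$(basename "$1")"
}

# Check out branch $2 into work-tree $1 using that tree's own index.
checkout_into() {
    local index
    index=$(index_for_tree "$1") || return 1
    GIT_WORK_TREE=$1 GIT_INDEX_FILE=$index git checkout -f "$2"
}
```

In the original loop, each branch of the if statement would then become a call such as checkout_into /var/www/production "$branch".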

The last method is considerably more time-consuming than the first two, but does have an advantage, if you do it carefully. Consider what happens to your web server while Git is updating files. Git looks at the index to see what is checked-out now, and looks at what you gave to git checkout to see what should be checked-out. Let's say files index.html, blah.html, and foo.css must be updated. Git changes one of them, and just then, your web server gets a new connection ... and reads the old index.html while reading the new blah.html.

What happens? Who knows? The point here is that your web server sees an inconsistent snapshot. It's probably not very inconsistent, and not for long, and maybe it's not a problem, but if you want really reliable software you might want to avoid it. Essentially, you need to have the web server read the old snapshot until the new snapshot is completely ready to go, which you can do by either freezing the web server, or doing the changeover as an atomic operation.

Now consider what happens if you have your server do this:

newtree=/var/www/newtree.$$
oldtree=/var/www/production.$$
# neither of these trees should exist, but do this
# in case we had a crash or something that left them behind
rm -rf $newtree $oldtree
mkdir $newtree

# populate the new tree (-f, as in the original hook, forces
# the files to be written into the empty directory)
GIT_WORK_TREE=$newtree git checkout -f $branch

# freeze / terminate the server (may not need this
# depending on how clever the server is -- it needs
# to notice the changeover)
service httpd stop

# swap the new tree in and the old one out, quickly
# (this is just two easy rename operations)
mv /var/www/production $oldtree
mv $newtree /var/www/production

# unfreeze/resume the server
service httpd start

# finally, delete the old tree (this does not need to be fast)
rm -rf $oldtree

This gives you a relatively minimal time during which the server is stopped or frozen (and instead of killing/stopping it completely, you might be able to just send it a notice that its directory has changed, and then wait a few seconds for it to switch over). The cost is that you must temporarily have both old and new trees, and setting up the new tree takes longer than swapping out just a few files.
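
If the web server's document root can be a symbolic link, the changeover itself can be reduced to a single rename, which may avoid stopping the server at all. This is a different technique from the two-rename script above, shown only as a minimal sketch: it assumes GNU mv (for the -T option), and the switch_symlink name is invented for illustration.

```shell
# Atomically repoint the symlink $1 at directory $2.
# Assumes GNU mv (-T) and that $1, if it exists, is already a
# symlink rather than a real directory.
switch_symlink() {
    local link=$1 target=$2 tmp
    tmp=$link.tmp.$$
    # Build the new symlink under a temporary name first...
    ln -sfn "$target" "$tmp" || return 1
    # ...then rename it over the old one in a single step.
    mv -T "$tmp" "$link"
}
```

A hook could check each push out into a fresh directory and then call something like switch_symlink /var/www/production "$newtree". Old directories would still need to be cleaned up afterwards, and the server must resolve the symlink on each request for the switch to take effect.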

Incidentally

This:

branch=$(git rev-parse --symbolic --abbrev-ref $refname)

is a bit misleading, because $refname is not necessarily a branch at all. It may be refs/heads/master (which is a branch, master) or refs/tags/v1.2 (which is not a branch—it's a tag) or refs/notes/commits (which is neither a branch nor a tag). It's good enough here, but it might be wiser to do:

case $refname in
refs/heads/production) deploy production;;
refs/heads/staging) deploy staging;;
refs/heads/dev) deploy dev;;
*) ;; # do nothing
esac

where deploy is a shell function that deploys the named branch ($1) to /var/www/$1. Otherwise you're re-deploying dev for pushes to master and for tag creations.
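
A minimal sketch of how such a deploy function and the surrounding loop might look, combining the case statement with the separate-index idea from earlier. The handle_push name and the index.<branch> file names are assumptions made for this example, not part of the original hook.

```shell
#!/bin/bash
# Sketch only. Deploy branch $1 to /var/www/$1, using an index file
# that belongs to that work-tree alone.
deploy() {
    local gitdir
    # Absolute path, in case GIT_DIR is relative (e.g. "." in a hook).
    gitdir=$(cd "${GIT_DIR:-.}" && pwd) || return 1
    GIT_WORK_TREE=/var/www/$1 GIT_INDEX_FILE=$gitdir/index.$1 \
        git checkout -f "$1"
}

# Read the "oldrev newrev refname" lines that post-receive gets on
# stdin and deploy only the three known branches.
handle_push() {
    local oldrev newrev refname
    while read -r oldrev newrev refname; do
        case $refname in
        refs/heads/production) deploy production ;;
        refs/heads/staging)    deploy staging ;;
        refs/heads/dev)        deploy dev ;;
        *) ;;  # tags, notes, other branches: do nothing
        esac
    done
}
```

In the real hook, the last line of the script would be a call to handle_push.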


[1] RIP CES, although actually he never said that.

[2] There's also one HEAD per work-tree, which git worktree would also manage correctly here, though again I've never actually tried this in a deployment script. I am not 100% sure what happens if the deployed branches ever point to the same commit ID: the work-flows I have used generally dictate that that can't happen anyway, so that git checkout <branch> is always moving HEAD. Moving HEAD guarantees that git checkout will do some work. It might be interesting to test the separate-index, shared-HEAD method with two branches pointing to the same commit ID, to see what happens.

In any case, one side effect of fussing with a single HEAD is that new clones will check out different default branches (since the default branch is determined by origin's HEAD).
