Are hitless rolling updates possible on GKE with externalTrafficPolicy: Local?


Question

I have a GKE cluster (1.12.10-gke.17).

I am running an nginx-ingress-controller with type: LoadBalancer.

I have set externalTrafficPolicy: Local to preserve the source IP.
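For reference, a minimal sketch of such a Service (the name, labels, and ports here are illustrative, not from my actual manifests):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-controller   # hypothetical name
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local     # preserve the client source IP
  # with Local, k8s also allocates spec.healthCheckNodePort, which the
  # Google load balancer health-checks to find nodes with local endpoints
  selector:
    app: nginx-ingress             # hypothetical label
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443
```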

Everything works great, except during rolling updates. I have maxSurge: 1 and maxUnavailable: 0.
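The relevant part of the Deployment looks roughly like this (a sketch; the name and replica count are made up):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app            # hypothetical name
spec:
  replicas: 3             # illustrative
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # start one new pod before removing an old one
      maxUnavailable: 0   # never drop below the desired replica count
```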

My problem is that during a rolling update, I start getting request timeouts. I suspect the Google load balancer is still sending requests to the node where the pod is Terminating, even though the health checks are failing. This happens for about 30-60s, starting right when the pod changes from Running to Terminating. Everything stabilizes after a while, and traffic eventually goes only to the new node with the new pod.

If the load balancer is slow to stop sending requests to a terminating pod, is there some way to make these rolling deploys hitless?

My understanding is that in a normal k8s service, where externalTrafficPolicy is not Local, the Google load balancer simply sends requests to all nodes and lets iptables sort it out. When a pod is Terminating, the iptables rules are updated quickly and traffic stops being sent to that pod. In the case where externalTrafficPolicy is Local, however, if the node that receives the request does not have a Running pod, then the request times out, which is what is happening here.

If this is correct, then I only see two options:

  1. stop sending requests to the node with a Terminating pod
  2. continue servicing requests even though the pod is Terminating

I feel like option 1 is difficult since it requires informing the load balancer that the pod is about to start Terminating.

I've made some progress on option 2, but so far haven't gotten it working. I've managed to keep serving requests from the pod by adding a preStop lifecycle hook that just runs sleep 60, but I think the problem is that the healthCheckNodePort reports localEndpoints: 0, and I suspect something is blocking the request between arriving at the node and getting to the pod. Perhaps iptables isn't routing when localEndpoints: 0.
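The preStop attempt looks roughly like this (a sketch; the container name and grace period are assumptions):

```yaml
spec:
  terminationGracePeriodSeconds: 90   # must be longer than the preStop sleep
  containers:
    - name: nginx                     # hypothetical name
      lifecycle:
        preStop:
          exec:
            # delay SIGTERM so the pod can keep serving in-flight requests
            command: ["/bin/sh", "-c", "sleep 60"]
```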

I've also adjusted the Google load balancer health check, which is separate from the readinessProbe and livenessProbe, to the "fastest" settings possible, e.g. a 1s interval and a failure threshold of 1. I've verified that the load balancer backend, aka the k8s node, does indeed start failing health checks quickly, but the load balancer continues to send requests to the terminating pod anyway.

Answer

There is a similar discussion here. Although it's not identical, it's a similar use case.

Everything sounds like it is working as expected.

  • The LoadBalancer will send traffic to any healthy node based on the LoadBalancer health check. The LoadBalancer is unaware of individual pods.

The health check will mark a node as unhealthy once the health check threshold is crossed, i.e. the HC is sent every x seconds, with an x-second timeout and a threshold of x failed requests. This causes a delay between the time the pod goes into Terminating and the time the node is marked as unhealthy.

Also note that once a pod is marked notReady, it is removed from the service endpoints. If there is no other pod on the node, traffic will continue reaching this node (because of the HC behaviour explained above), but the requests can't be forwarded elsewhere because of externalTrafficPolicy (traffic remains on the node where it was sent).

There are a few ways to address this:

  1. To minimize the amount of time between a pod terminating and the node being marked as unhealthy, you can set a more aggressive health check. The trouble with this is that an overly sensitive HC may cause false positives, it usually increases the overhead on the node (additional health check requests), and it will not fully eliminate the failed requests.

  2. Have enough pods running so that there are always at least 2 pods per node (sketched at the end of this answer). Since the service removes a pod from the endpoints once it goes notReady, requests will just get sent to the other, still-running pod instead. The downside here is that you will either have additional overhead (more pods) or a tighter grouping (more vulnerable to failure). It also won't fully eliminate the failed requests, but they will be incredibly few.

  3. Tweak the HC and your container to work together (sketched at the end of this answer):
     3a. Have the HC endpoint be separate from the normal path you use.
     3b. Configure the container readinessProbe to match the main path your container serves traffic on (it will be different from the LB HC path).
     3c. Configure your image so that when SIGTERM is received, the first thing to go down is the HC path.
     3d. Configure the image to gracefully drain all connections once SIGTERM is received, rather than immediately closing all sessions and connections.

This should mean that ongoing sessions will terminate gracefully, which reduces errors. It should also mean that the node will start failing HC probes even while it is still ready to serve normal traffic; this gives time for the node to be marked as unhealthy, and the LB will stop sending it traffic before it is no longer able to serve requests.

The problem with this last option is twofold. First, it is more complex to configure. The other issue is that your pods will take longer to terminate, so rolling updates will take longer, as will any other process that relies on gracefully terminating the pod, such as draining the node. The second issue isn't too bad unless you need a quick turnaround.
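To make option 2 concrete, here is a sketch that keeps roughly two replicas per node by pairing extra replicas with soft pod anti-affinity (the names, replica count, and 3-node-pool assumption are mine, not from the answer):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                   # hypothetical name
spec:
  replicas: 6                    # ~2 pods per node on a 3-node pool
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          # soft anti-affinity spreads the replicas evenly across nodes,
          # so each node keeps a second pod to fall back on
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: my-app
                topologyKey: kubernetes.io/hostname
      containers:
        - name: app
          image: example/app:1.0   # placeholder image
```

And a pod-spec sketch of option 3 (3a-3d); the paths, port, and timings are assumptions, and the SIGTERM behaviour itself has to be implemented inside the image:

```yaml
spec:
  terminationGracePeriodSeconds: 60  # room for 3d's connection draining
  containers:
    - name: app                      # hypothetical name
      image: example/app:1.0         # placeholder image
      # 3b: readiness tracks the real serving path, not the LB HC path
      readinessProbe:
        httpGet:
          path: /                    # main traffic path
          port: 8080
      # 3a: the LB health check is pointed at a dedicated endpoint,
      #     e.g. /lb-healthz, served by the app on the same port
      # 3c: on SIGTERM the app fails /lb-healthz first, while still
      #     serving / so in-flight traffic is unaffected
      # 3d: the app then drains open connections before exiting
```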
