Kubernetes recreate pod if node becomes offline timeout


Question

I've started working with Docker images and set up Kubernetes. Everything is configured, but I am having problems with the timeout for pod recreation.

If a pod is running on one particular node and I shut that node down, it takes ~5 minutes to recreate the pod on another online node.

I've checked all the possible config files and also set the pod-eviction-timeout, horizontal-pod-autoscaler-downscale, and horizontal-pod-autoscaler-downscale-delay flags, but it is still not working.

Current kube-controller-manager config:

spec:
  containers:
  - command:
    - kube-controller-manager
    - --address=192.168.5.135
    - --allocate-node-cidrs=false
    - --authentication-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --authorization-kubeconfig=/etc/kubernetes/controller-manager.conf
    - --client-ca-file=/etc/kubernetes/pki/ca.crt
    - --cluster-cidr=192.168.5.0/24
    - --cluster-signing-cert-file=/etc/kubernetes/pki/ca.crt
    - --cluster-signing-key-file=/etc/kubernetes/pki/ca.key
    - --controllers=*,bootstrapsigner,tokencleaner
    - --kubeconfig=/etc/kubernetes/controller-manager.conf
    - --leader-elect=true
    - --node-cidr-mask-size=24
    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
    - --root-ca-file=/etc/kubernetes/pki/ca.crt
    - --service-account-private-key-file=/etc/kubernetes/pki/sa.key
    - --use-service-account-credentials=true
    - --horizontal-pod-autoscaler-downscale-delay=20s
    - --horizontal-pod-autoscaler-sync-period=20s
    - --node-monitor-grace-period=40s
    - --node-monitor-period=5s
    - --pod-eviction-timeout=20s
    - --use-service-account-credentials=true
    - --horizontal-pod-autoscaler-downscale-stabilization=20s
    image: k8s.gcr.io/kube-controller-manager:v1.13.0

Thanks.

Answer

If Taint Based Evictions are present in the pod definition, the controller manager will not be able to evict a pod that tolerates the taint. Even if you don't define an eviction policy in your configuration, pods get a default one, because the Default Toleration Seconds admission controller plugin is enabled by default.

The Default Toleration Seconds admission controller plugin configures your pod like below:

tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 300
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 300

You can verify this by inspecting the definition of your pod:

kubectl get pods -o yaml -n <namespace> <pod-name>
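
If you only need to see the tolerations, a jsonpath query keeps the output short (same placeholder names as above):

kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.tolerations}'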

According to the above toleration, it takes more than 5 minutes to recreate the pod on another ready node, since the pod can tolerate the not-ready taint for up to 5 minutes. In this case, even if you set --pod-eviction-timeout to 20s, there is nothing the controller manager can do because of the tolerations.

But why does it take more than 5 minutes? Because the node is only considered down after --node-monitor-grace-period, which defaults to 40s. After that, the pod toleration timer starts.
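
As a rough, illustrative timeline with the default settings (exact numbers vary a little with status-update timing):

node stops responding                         t = 0
node marked NotReady                          t ≈ 40s          (--node-monitor-grace-period)
default toleration expires, pod is evicted    t ≈ 40s + 300s ≈ 5m40s
replacement pod scheduled on a ready node     shortly afterwards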

If you want your cluster to react faster to node outages, you should use taints and tolerations without modifying the options. For example, you can define your pod like below:

tolerations:
- key: node.kubernetes.io/not-ready
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 0
- key: node.kubernetes.io/unreachable
  operator: Exists
  effect: NoExecute
  tolerationSeconds: 0

With the above toleration your pod will be recreated on a ready node as soon as the current node is marked not ready. This should take less than a minute, since --node-monitor-grace-period defaults to 40s.
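
As a minimal sketch (the Deployment name, labels, and image below are placeholders, not part of the original question), these tolerations go into the pod template spec of the workload, for example a Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx:1.15     # placeholder image
      tolerations:
      - key: node.kubernetes.io/not-ready
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 0
      - key: node.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 0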

If you want to take control of these timings, you will find plenty of options below to do so. However, modifying these options should generally be avoided: tight timings create extra load on etcd, because every node then tries to update its status very frequently.

In addition to this, it is currently not clear how to propagate changes to the controller manager, API server, and kubelet configuration to all nodes in a living cluster. Please see the tracking issue for changing the cluster and Dynamic Kubelet Configuration; as of this writing, reconfiguring a node's kubelet in a live cluster is in beta.

You can configure the control plane and the kubelet during the kubeadm init or join phase. Please refer to Customizing control plane configuration with kubeadm and Configuring each kubelet in your cluster using kubeadm for more details.
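
For illustration only, a kubeadm ClusterConfiguration sketch that overrides some of these flags at init time could look like this (the apiVersion matches kubeadm for v1.13; all timing values are placeholders, not recommendations):

# kubeadm-config.yaml - illustrative values only
apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
kubernetesVersion: v1.13.0
controllerManager:
  extraArgs:
    node-monitor-grace-period: "20s"
    pod-eviction-timeout: "30s"
apiServer:
  extraArgs:
    default-not-ready-toleration-seconds: "30"
    default-unreachable-toleration-seconds: "30"

It would then be passed with kubeadm init --config kubeadm-config.yaml.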

Assuming you have a single node cluster:

• controller manager includes:
  • --node-monitor-grace-period default 40s
  • --node-monitor-period default 5s
  • --pod-eviction-timeout default 5m0s
• api server includes:
  • --default-not-ready-toleration-seconds default 300
  • --default-unreachable-toleration-seconds default 300
• kubelet includes:
  • --node-status-update-frequency default 10s

If you set up the cluster with kubeadm you can modify:

• /etc/kubernetes/manifests/kube-controller-manager.yaml for controller manager options.
• /etc/kubernetes/manifests/kube-apiserver.yaml for API server options.

Note: Modifying these files will reconfigure and restart the respective static pod on the node.

In order to modify the kubelet config you can add the line below:

KUBELET_EXTRA_ARGS="--node-status-update-frequency=10s"

to /etc/default/kubelet (for DEBs) or /etc/sysconfig/kubelet (for RPMs), and then restart the kubelet service:

sudo systemctl daemon-reload && sudo systemctl restart kubelet
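
To confirm that the kubelet actually picked up the extra argument, one simple check (a sketch, not the only way) is to look at the running process arguments:

ps aux | grep node-status-update-frequency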
      

