Occasionally pods will be created with no network which results in the pod failing repeatedly with CrashLoopBackOff


Problem description

Occasionally, a pod starts up without network connectivity. Because of this, it goes into CrashLoopBackOff and cannot recover. The only way I can get the pod running again is to run kubectl delete pod and wait for it to be rescheduled. Here's an example of a liveness probe failing because of this issue:

Liveness probe failed: Get http://172.20.78.9:9411/health: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

I've also noticed that there are no iptables entries for the pod IP when this happens. Once the pod is deleted and rescheduled (and is in a working state), the iptables entries are present.
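
The iptables check described here can be made repeatable with a small helper. The following is only a sketch: the function name `rules_mentioning_ip` and the sample dump are made up for illustration, and it assumes you have already captured the text output of `iptables-save` on the node.

```python
import re
from typing import List


def rules_mentioning_ip(iptables_dump: str, pod_ip: str) -> List[str]:
    """Return the lines of an iptables-save dump that mention pod_ip exactly."""
    # The lookarounds keep 172.20.78.9 from matching 172.20.78.90 or 1172.20.78.9.
    pattern = re.compile(r"(?<![\d.])" + re.escape(pod_ip) + r"(?!\.?\d)")
    return [line for line in iptables_dump.splitlines() if pattern.search(line)]


# Made-up sample dump for illustration only; it is NOT output from the
# cluster in the question, and the chain name is hypothetical.
SAMPLE_DUMP = """\
*nat
-A EXAMPLE-CHAIN -p tcp -m tcp -j DNAT --to-destination 172.20.78.9:9411
-A EXAMPLE-CHAIN -p tcp -m tcp -j DNAT --to-destination 172.20.78.90:9411
COMMIT
"""

if __name__ == "__main__":
    for rule in rules_mentioning_ip(SAMPLE_DUMP, "172.20.78.9"):
        print(rule)
```

Running the helper against a real dump while the pod is broken, and again after it has been rescheduled, would show whether rules for the pod IP are missing in the failing state.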

If I turn off the liveness probe for the container and exec into it, I can confirm that it has no network connectivity to the cluster, the local network, or the internet.

I would like to hear any suggestions as to what the cause could be, or what else I can look into to troubleshoot this further.

Currently running:

Kubernetes version:

Client Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.7",
GitCommit:"92b4f971662de9d8770f8dcd2ee01ec226a6f6c0", 
GitTreeState:"clean", BuildDate:"2016-12-10T04:49:33Z", 
GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Server Version: version.Info{Major:"1", Minor:"4", GitVersion:"v1.4.7",  
GitCommit:"92b4f971662de9d8770f8dcd2ee01ec226a6f6c0", 
GitTreeState:"clean", BuildDate:"2016-12-10T04:43:42Z", 
GoVersion:"go1.6.3", Compiler:"gc", Platform:"linux/amd64"}

Operating system:

NAME=CoreOS
ID=coreos
VERSION=1235.0.0
VERSION_ID=1235.0.0
BUILD_ID=2016-11-17-0416
PRETTY_NAME="CoreOS 1235.0.0 (MoreOS)"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"

Recommended answer

It looks like your network driver is not working properly. Without more information about your setup, I can only suggest the following:

  1. Find out which network driver is in use. You can tell by checking the kubelet --network-plugin flag. If no network plugin is specified, the pod is using native Docker networking.
  2. Given the network driver, examine the pod's network setup and see what is missing. Use tcpdump to see where the packets go.
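
As a rough illustration of the first step, the value of --network-plugin can be pulled out of the kubelet command line programmatically. This is a minimal sketch, not part of the original answer: the function name `network_plugin_from_cmdline` and the sample command line are hypothetical, and it assumes you have the kubelet's arguments as a single string.

```python
import shlex
from typing import Optional

FLAG = "network-plugin"


def network_plugin_from_cmdline(cmdline: str) -> Optional[str]:
    """Return the value of the kubelet network-plugin flag, or None if unset."""
    args = shlex.split(cmdline)
    for i, arg in enumerate(args):
        if not arg.startswith("-"):
            continue
        # Accept both "--flag=value" and "--flag value" spellings.
        name, sep, value = arg.lstrip("-").partition("=")
        if name != FLAG:
            continue  # e.g. --network-plugin-dir is a different flag
        if sep:
            return value
        if i + 1 < len(args):
            return args[i + 1]
    return None


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    sample = "/kubelet --network-plugin=cni --network-plugin-dir=/etc/kubernetes/cni/net.d"
    print(network_plugin_from_cmdline(sample))
```

On a node, the input string could come from something like `ps -o args= -C kubelet`. A result of None would correspond to the case in the answer where no network plugin is specified.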
