Kubernetes 1.7 on Google Cloud: FailedSync Error syncing pod, SandboxChanged Pod sandbox changed, it will be killed and re-created


Problem Description


My Kubernetes pods and containers are not starting. They are stuck with the status ContainerCreating.

I ran the command kubectl describe po PODNAME, which lists the events, and I see the following error:

Type        Reason            Message
Warning     FailedSync        Error syncing pod
Normal      SandboxChanged    Pod sandbox changed, it will be killed and re-created.

The Count column indicates that these errors are being repeated over and over again, roughly once a second. The full output from this command is below, but how do I go about debugging this? I'm not even sure what these errors mean.

Name:           ocr-extra-2939512459-3hkv1
Namespace:      ocr-da-cluster
Node:           gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2/10.240.0.11
Start Time:     Tue, 24 Oct 2017 21:05:01 -0400
Labels:         component=ocr
                pod-template-hash=2939512459
                role=extra
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"ocr-da-cluster","name":"ocr-extra-2939512459","uid":"d58bd050-b8f3-11e7-9f9e-4201...
Status:         Pending
IP:
Created By:     ReplicaSet/ocr-extra-2939512459
Controlled By:  ReplicaSet/ocr-extra-2939512459
Containers:
  ocr-node:
    Container ID:
    Image:              us.gcr.io/ocr-api/ocr-image
    Image ID:
    Ports:              80/TCP, 443/TCP, 5555/TCP, 15672/TCP, 25672/TCP, 4369/TCP, 11211/TCP
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Requests:
      cpu:      31
      memory:   10Gi
    Liveness:   http-get http://:http/ocr/live delay=270s timeout=30s period=60s #success=1 #failure=5
    Readiness:  http-get http://:http/_ah/warmup delay=180s timeout=60s period=120s #success=1 #failure=3
    Environment:
      NAMESPACE:        ocr-da-cluster (v1:metadata.namespace)
    Mounts:
      /var/log/apache2 from apachelog (rw)
      /var/log/celery from cellog (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
  log-apache2-error:
    Container ID:
    Image:              busybox
    Image ID:
    Port:               <none>
    Args:
      /bin/sh
      -c
      echo Apache2 Error && sleep 90 && tail -n+1 -F /var/log/apache2/error.log
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Requests:
      cpu:              20m
    Environment:        <none>
    Mounts:
      /var/log/apache2 from apachelog (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
  log-worker-1:
    Container ID:
    Image:              busybox
    Image ID:
    Port:               <none>
    Args:
      /bin/sh
      -c
      echo Celery Worker && sleep 90 && tail -n+1 -F /var/log/celery/worker*.log
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Requests:
      cpu:              20m
    Environment:        <none>
    Mounts:
      /var/log/celery from cellog (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  apachelog:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  cellog:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-dhjr5:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-dhjr5
    Optional:   false
QoS Class:      Burstable
Node-Selectors: beta.kubernetes.io/instance-type=n1-highcpu-32
Tolerations:    node.alpha.kubernetes.io/notReady:NoExecute for 300s
                node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  FirstSeen     LastSeen        Count   From                                                        SubObjectPath       Type            Reason                  Message
  ---------     --------        -----   ----                                                        -------------       --------        ------                  -------
  10m           10m             2       default-scheduler                                                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient cpu (10), Insufficient memory (2), MatchNodeSelector (2).
  10m           10m             1       default-scheduler                                                       Normal          Scheduled               Successfully assigned ocr-extra-2939512459-3hkv1 to gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2
  10m           10m             1       kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "apachelog"
  10m           10m             1       kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "cellog"
  10m           10m             1       kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-dhjr5"
  10m           1s              382     kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Warning         FailedSync              Error syncing pod
  10m           0s              382     kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2                    Normal          SandboxChanged          Pod sandbox changed, it will be killed and re-created.

Solution

Check your resource limits. I faced the same issue, and the reason in my case was that I was using m instead of Mi for memory limits and memory requests.
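
For illustration, here is a minimal sketch of the two forms of a memory value (the numbers are examples, not taken from the pod above). In Kubernetes resource quantities, the lowercase m suffix means milli-units, so for memory a value like 10240m is parsed as roughly 10 bytes, while 10Gi means 10 gibibytes:

# Wrong: for memory, the lowercase "m" suffix means milli (1/1000),
# so 10240m is parsed as about 10 bytes
resources:
  requests:
    cpu: 20m          # m is fine for CPU (20 millicores)
    memory: 10240m    # ~10 bytes -- almost never what was intended
  limits:
    memory: 10240m

# Right: use Mi/Gi (binary) or M/G (decimal) suffixes for memory
resources:
  requests:
    cpu: 20m
    memory: 10Gi
  limits:
    memory: 10Gi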

