Kubernetes 1.7 on Google Cloud: FailedSync Error syncing pod, SandboxChanged Pod sandbox changed, it will be killed and re-created
Problem description
My Kubernetes pods and containers are not starting. They are stuck with the status ContainerCreating.
I ran the command kubectl describe po PODNAME, which lists the events, and I see the following error:
Type Reason Message
Warning FailedSync Error syncing pod
Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
The Count column indicates that these errors are repeated over and over, roughly once a second. The full output from this command is below, but how do I go about debugging this? I'm not even sure what these errors mean.
Name: ocr-extra-2939512459-3hkv1
Namespace: ocr-da-cluster
Node: gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2/10.240.0.11
Start Time: Tue, 24 Oct 2017 21:05:01 -0400
Labels: component=ocr
pod-template-hash=2939512459
role=extra
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"ocr-da-cluster","name":"ocr-extra-2939512459","uid":"d58bd050-b8f3-11e7-9f9e-4201...
Status: Pending
IP:
Created By: ReplicaSet/ocr-extra-2939512459
Controlled By: ReplicaSet/ocr-extra-2939512459
Containers:
ocr-node:
Container ID:
Image: us.gcr.io/ocr-api/ocr-image
Image ID:
Ports: 80/TCP, 443/TCP, 5555/TCP, 15672/TCP, 25672/TCP, 4369/TCP, 11211/TCP
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 31
memory: 10Gi
Liveness: http-get http://:http/ocr/live delay=270s timeout=30s period=60s #success=1 #failure=5
Readiness: http-get http://:http/_ah/warmup delay=180s timeout=60s period=120s #success=1 #failure=3
Environment:
NAMESPACE: ocr-da-cluster (v1:metadata.namespace)
Mounts:
/var/log/apache2 from apachelog (rw)
/var/log/celery from cellog (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
log-apache2-error:
Container ID:
Image: busybox
Image ID:
Port: <none>
Args:
/bin/sh
-c
echo Apache2 Error && sleep 90 && tail -n+1 -F /var/log/apache2/error.log
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 20m
Environment: <none>
Mounts:
/var/log/apache2 from apachelog (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
log-worker-1:
Container ID:
Image: busybox
Image ID:
Port: <none>
Args:
/bin/sh
-c
echo Celery Worker && sleep 90 && tail -n+1 -F /var/log/celery/worker*.log
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Requests:
cpu: 20m
Environment: <none>
Mounts:
/var/log/celery from cellog (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-dhjr5 (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
apachelog:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
cellog:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-dhjr5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-dhjr5
Optional: false
QoS Class: Burstable
Node-Selectors: beta.kubernetes.io/instance-type=n1-highcpu-32
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
10m 10m 2 default-scheduler Warning FailedScheduling No nodes are available that match all of the following predicates:: Insufficient cpu (10), Insufficient memory (2), MatchNodeSelector (2).
10m 10m 1 default-scheduler Normal Scheduled Successfully assigned ocr-extra-2939512459-3hkv1 to gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2
10m 10m 1 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "apachelog"
10m 10m 1 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "cellog"
10m 10m 1 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SuccessfulMountVolume MountVolume.SetUp succeeded for volume "default-token-dhjr5"
10m 1s 382 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Warning FailedSync Error syncing pod
10m 0s 382 kubelet, gke-da-ocr-api-gce-cluster-extra-pool-65029b63-6qs2 Normal SandboxChanged Pod sandbox changed, it will be killed and re-created.
Answer
Check your resource limits. I faced the same issue, and in my case it was because I was using m instead of Mi for the memory limits and memory requests.
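As a minimal, hypothetical sketch (the container values below are illustrative and not taken from the pod above): in a container's resources block, the m suffix means milli-units and is normally used for CPU, while memory quantities should use byte suffixes such as Mi or Gi. A memory request written as 512m would be parsed as a fraction of a byte rather than 512 mebibytes.

# Hypothetical resources excerpt from a container spec; values are illustrative.
resources:
  requests:
    cpu: 500m        # CPU: "m" means millicores, so 500m = 0.5 CPU
    memory: 512Mi    # memory: use Mi/Gi; "512m" would mean ~0.5 bytes
  limits:
    cpu: "1"
    memory: 1Gi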