Kubernetes pod pending when a new volume is attached (EKS)

Problem description

Let me describe my scenario:

When I create a deployment on Kubernetes with 1 attached volume, everything works perfectly. When I create the same deployment, but with a second volume attached (total: 2 volumes), the pod gets stuck on "Pending" with the following errors:

pod has unbound PersistentVolumeClaims (repeated 2 times)
0/2 nodes are available: 2 node(s) had no available volume zone.

Already checked that the volumes are created in the correct availability zones.

I have a cluster set up using Amazon EKS, with 2 nodes. I have the following default storage class:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
mountOptions:
  - debug

And I have a mongodb deployment which needs two volumes, one mounted on the /data/db folder, and the other mounted in some random directory I need. Here is a minimal YAML used to create the three components (I commented out some lines on purpose):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: my-project
  creationTimestamp: null
  labels:
    io.kompose.service: my-project-db-claim0
  name: my-project-db-claim0
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  namespace: my-project
  creationTimestamp: null
  labels:
    io.kompose.service: my-project-db-claim1
  name: my-project-db-claim1
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  namespace: my-project
  name: my-project-db
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        name: my-db
    spec:
      containers:
        - name: my-project-db-container
          image: mongo
          imagePullPolicy: Always
          resources: {}
          volumeMounts:
          - mountPath: /my_dir
            name: my-project-db-claim0
          # - mountPath: /data/db
          #   name: my-project-db-claim1
          ports:
            - containerPort: 27017
      restartPolicy: Always
      volumes:
      - name: my-project-db-claim0
        persistentVolumeClaim:
          claimName: my-project-db-claim0
      # - name: my-project-db-claim1
      #   persistentVolumeClaim:
      #     claimName: my-project-db-claim1

That YAML works perfectly. The output for the volumes is:

$ kubectl describe pv

Name:            pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6
Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                failure-domain.beta.kubernetes.io/zone=us-east-1c
Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                pv.kubernetes.io/bound-by-controller: yes
                pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    gp2
Status:          Bound
Claim:           my-project/my-project-db-claim0
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        5Gi
Node Affinity:   <none>
Message:        
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-1c/vol-xxxxx
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>


Name:            pvc-308d8979-039e-11e9-b78d-0a68bcb24bc6
Labels:          failure-domain.beta.kubernetes.io/region=us-east-1
                failure-domain.beta.kubernetes.io/zone=us-east-1b
Annotations:     kubernetes.io/createdby: aws-ebs-dynamic-provisioner
                pv.kubernetes.io/bound-by-controller: yes
                pv.kubernetes.io/provisioned-by: kubernetes.io/aws-ebs
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    gp2
Status:          Bound
Claim:           my-project/my-project-db-claim1
Reclaim Policy:  Delete
Access Modes:    RWO
Capacity:        10Gi
Node Affinity:   <none>
Message:        
Source:
    Type:       AWSElasticBlockStore (a Persistent Disk resource in AWS)
    VolumeID:   aws://us-east-1b/vol-xxxxx
    FSType:     ext4
    Partition:  0
    ReadOnly:   false
Events:         <none>

And the pod output:

$ kubectl describe pods

Name:               my-project-db-7d48567b48-slncd
Namespace:          my-project
Priority:           0
PriorityClassName:  <none>
Node:               ip-192-168-212-194.ec2.internal/192.168.212.194
Start Time:         Wed, 19 Dec 2018 15:55:58 +0100
Labels:             name=my-db
                    pod-template-hash=3804123604
Annotations:        <none>
Status:             Running
IP:                 192.168.216.33
Controlled By:      ReplicaSet/my-project-db-7d48567b48
Containers:
  my-project-db-container:
    Container ID:   docker://cf8222f15e395b02805c628b6addde2d77de2245aed9406a48c7c6f4dccefd4e
    Image:          mongo
    Image ID:       docker-pullable://mongo@sha256:0823cc2000223420f88b20d5e19e6bc252fa328c30d8261070e4645b02183c6a
    Port:           27017/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Wed, 19 Dec 2018 15:56:42 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /my_dir from my-project-db-claim0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  my-project-db-claim0:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-project-db-claim0
    ReadOnly:   false
  default-token-pf9ks:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pf9ks
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                    From                                      Message
  ----     ------                  ----                   ----                                      -------
  Warning  FailedScheduling        7m22s (x5 over 7m23s)  default-scheduler                         pod has unbound PersistentVolumeClaims (repeated 2 times)
  Normal   Scheduled               7m21s                  default-scheduler                         Successfully assigned my-project/my-project-db-7d48567b48-slncd to ip-192-168-212-194.ec2.internal
  Normal   SuccessfulMountVolume   7m21s                  kubelet, ip-192-168-212-194.ec2.internal  MountVolume.SetUp succeeded for volume "default-token-pf9ks"
  Warning  FailedAttachVolume      7m13s (x5 over 7m21s)  attachdetach-controller                   AttachVolume.Attach failed for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6" : "Error attaching EBS volume \"vol-01a863d0aa7c7e342\"" to instance "i-0a7dafbbdfeabc50b" since volume is in "creating" state
  Normal   SuccessfulAttachVolume  7m1s                   attachdetach-controller                   AttachVolume.Attach succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
  Normal   SuccessfulMountVolume   6m48s                  kubelet, ip-192-168-212-194.ec2.internal  MountVolume.SetUp succeeded for volume "pvc-307b755a-039e-11e9-b78d-0a68bcb24bc6"
  Normal   Pulling                 6m48s                  kubelet, ip-192-168-212-194.ec2.internal  pulling image "mongo"
  Normal   Pulled                  6m39s                  kubelet, ip-192-168-212-194.ec2.internal  Successfully pulled image "mongo"
  Normal   Created                 6m38s                  kubelet, ip-192-168-212-194.ec2.internal  Created container
  Normal   Started                 6m37s                  kubelet, ip-192-168-212-194.ec2.internal  Started container

Everything is created without any problems. But if I uncomment the lines in the YAML so that two volumes are attached to the db deployment, the pv output is the same as before, but the pod gets stuck on Pending with the following output:

$ kubectl describe pods

Name:               my-project-db-b8b8d8bcb-l64d7
Namespace:          my-project
Priority:           0
PriorityClassName:  <none>
Node:               <none>
Labels:             name=my-db
                    pod-template-hash=646484676
Annotations:        <none>
Status:             Pending
IP:                 
Controlled By:      ReplicaSet/my-project-db-b8b8d8bcb
Containers:
  my-project-db-container:
    Image:        mongo
    Port:         27017/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /data/db from my-project-db-claim1 (rw)
      /my_dir from my-project-db-claim0 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-pf9ks (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  my-project-db-claim0:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-project-db-claim0
    ReadOnly:   false
  my-project-db-claim1:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-project-db-claim1
    ReadOnly:   false
  default-token-pf9ks:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-pf9ks
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  60s (x5 over 60s)  default-scheduler  pod has unbound PersistentVolumeClaims (repeated 2 times)
  Warning  FailedScheduling  2s (x16 over 59s)  default-scheduler  0/2 nodes are available: 2 node(s) had no available volume zone.

I've already read these two issues:

Dynamic volume provisioning creates EBS volume in the wrong availability zone

PersistentVolume on EBS can be created in an availability zone with no node (Closed)

But I already checked that the volumes are created in the same zones as the cluster node instances. In fact, EKS creates two EBS volumes by default in the us-east-1b and us-east-1c zones, and those volumes work. The volumes created by the posted YAML are in those zones too.
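
One way to verify this (a quick sketch; the exact commands were not part of the original question) is to compare the zone label on each PV with the zone labels of the nodes:

$ kubectl get pv -L failure-domain.beta.kubernetes.io/zone
$ kubectl get nodes -L failure-domain.beta.kubernetes.io/zone

The -L flag adds a column with the label value, so each PV's zone can be matched against the zones the nodes actually run in; a PV in a zone with no node is exactly what produces the "no available volume zone" scheduling error.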

Recommended answer

See this article: https://kubernetes.io/blog/2018/10/11/topology-aware-volume-provisioning-in-kubernetes/

The gist is that you want to update your StorageClass to include:

volumeBindingMode: WaitForFirstConsumer

This causes the PV to not be created until the pod is scheduled. It fixed a similar problem for me.
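
For reference, here is a minimal sketch of the gp2 StorageClass from the question with that field added (note that volumeBindingMode cannot be changed on an existing StorageClass, so it may be necessary to delete and recreate the StorageClass, and to recreate the PVCs so new volumes pick up the new binding mode):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: gp2
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
mountOptions:
  - debug

With WaitForFirstConsumer, volume provisioning is delayed until a pod using the PVC is scheduled, so both EBS volumes are created in the availability zone of the node the pod lands on instead of in two different zones.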
