Pod stuck in Pending state when trying to schedule it on AWS Fargate


Question

I have an EKS cluster to which I've added support for hybrid mode (in other words, I've added a Fargate profile to it). My intention is to run only specific workloads on AWS Fargate while keeping the EKS worker nodes for other kinds of workloads.

To test this out, my Fargate profile is defined as follows:

  • It is restricted to a specific namespace (say, mynamespace)
  • It has a specific label that pods must match in order to be scheduled on Fargate (the label is fargate: myvalue)
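As a rough illustration of the selector rule described above, here is a minimal Python sketch. It assumes a pod matches when its namespace equals the selector's namespace exactly and every selector label is present on the pod with the same value. The function name `matches_fargate_selector` is hypothetical and is not part of any AWS or Kubernetes API. Note that the labels that matter are the ones on the pod (the Deployment's `template.metadata.labels`), not the ones on the Deployment object itself.

```python
from typing import Mapping


def matches_fargate_selector(
    pod_namespace: str,
    pod_labels: Mapping[str, str],
    selector_namespace: str,
    selector_labels: Mapping[str, str],
) -> bool:
    """Hypothetical helper: True if a pod would match a namespace-plus-labels selector.

    Assumption: exact namespace match, and every selector label must appear
    on the pod with the same value. Extra pod labels are ignored.
    """
    if pod_namespace != selector_namespace:
        return False
    return all(pod_labels.get(key) == value for key, value in selector_labels.items())


# Labels taken from the pod template in the question.
pod_labels = {"app": "nginx", "version": "1.7.9", "fargate": "myvalue"}
profile_labels = {"fargate": "myvalue"}

# Same namespace and matching label: the pod is a candidate for Fargate.
print(matches_fargate_selector("mynamespace", pod_labels, "mynamespace", profile_labels))

# Different namespace: the selector does not apply.
print(matches_fargate_selector("default", pod_labels, "mynamespace", profile_labels))
```

A match only decides which scheduler picks up the pod. As the rest of this thread shows, the pod can still stay Pending for other reasons.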

To test this with Kubernetes resources, I'm trying to deploy a simple nginx Deployment that looks like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: mynamespace
  labels:
    fargate: myvalue
spec:
  selector:
    matchLabels:
      app: nginx
      version: 1.7.9
      fargate: myvalue
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
        version: 1.7.9
        fargate: myvalue
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

When I apply this resource, I get the following:

$ kubectl get pods -n mynamespace -o wide
NAME                                                        READY   STATUS      RESTARTS   AGE     IP            NODE                          NOMINATED NODE                                READINESS GATES
nginx-deployment-596c594988-x9s6n                           0/1     Pending     0          10m     <none>        <none>                        07c651ad2b-7cf85d41b2424e529247def8bda7bf38   <none>

The pod stays in the Pending state and is never scheduled onto AWS Fargate.

This is the kubectl describe output for the pod:

$ kubectl describe pod nginx-deployment-596c594988-x9s6n -n mynamespace
Name:               nginx-deployment-596c594988-x9s6n
Namespace:          mynamespace
Priority:           2000001000
PriorityClassName:  system-node-critical
Node:               <none>
Labels:             app=nginx
                    eks.amazonaws.com/fargate-profile=myprofile
                    fargate=myvalue
                    pod-template-hash=596c594988
                    version=1.7.9
Annotations:        kubernetes.io/psp: eks.privileged
Status:             Pending
IP:
Controlled By:      ReplicaSet/nginx-deployment-596c594988
NominatedNodeName:  9e418415bf-8259a43075714eb3ab77b08049d950a8
Containers:
  nginx:
    Image:        nginx:1.7.9
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-784d2 (ro)
Volumes:
  default-token-784d2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-784d2
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

One thing I can conclude from this output is that the correct Fargate profile was chosen:

eks.amazonaws.com/fargate-profile=myprofile

Also, I see that a value has been added to the NOMINATED NODE field, but I'm not sure what it represents.

Any ideas, or common problems that might be worth troubleshooting in this case? Thanks.

Recommended answer

It turns out the problem was, all along, in the networking setup of the private subnets associated with the Fargate profile.

To give more detail, here is what I initially had and what I changed:

  1. An EKS cluster with several worker nodes, where I had assigned only public subnets to the EKS cluster itself.
  2. When I tried to add a Fargate profile to the EKS cluster, I found that, because of a current Fargate limitation, a profile cannot be associated with public subnets. To work around this, I created private subnets with the same tags as the public ones so that the EKS cluster is aware of them.
  3. What I forgot was that I needed to enable connectivity from the VPC's private subnets to the outside world (I was missing a NAT gateway). So I created a NAT gateway in a public subnet associated with EKS and added to the private subnets' route table an additional entry that looks like this:

0.0.0.0/0 nat-xxxxxxxx

This solved the problem described above, although I'm not sure of the real reason why an AWS Fargate profile can be associated only with private subnets.
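For anyone who wants to check their own subnets for the same gap, here is a minimal Python sketch that inspects a route table's entries. It assumes the routes are dicts in the shape the EC2 API returns for a route table (keys such as `DestinationCidrBlock`, `NatGatewayId`, and `GatewayId`). The helper names and the sample data are hypothetical, and the IDs are placeholders.

```python
from typing import Any, Mapping, Sequence

DEFAULT_ROUTE = "0.0.0.0/0"


def has_nat_default_route(routes: Sequence[Mapping[str, Any]]) -> bool:
    """Hypothetical helper: True if a default route targets a NAT gateway."""
    return any(
        route.get("DestinationCidrBlock") == DEFAULT_ROUTE
        and str(route.get("NatGatewayId", "")).startswith("nat-")
        for route in routes
    )


def has_igw_default_route(routes: Sequence[Mapping[str, Any]]) -> bool:
    """Hypothetical helper: True if a default route targets an internet gateway,
    which is what makes a subnet public."""
    return any(
        route.get("DestinationCidrBlock") == DEFAULT_ROUTE
        and str(route.get("GatewayId", "")).startswith("igw-")
        for route in routes
    )


# Sample data (placeholders): a private subnet before and after the fix.
routes_before = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
]
routes_after = routes_before + [
    {"DestinationCidrBlock": DEFAULT_ROUTE, "NatGatewayId": "nat-xxxxxxxx"},
]

print(has_nat_default_route(routes_before))  # no outbound path yet
print(has_nat_default_route(routes_after))   # default route via the NAT gateway
print(has_igw_default_route(routes_after))   # still a private subnet
```

In practice the route list would come from the EC2 API, for example the `Routes` field of each entry returned by boto3's `describe_route_tables`. That call is not included here so that the sketch stays runnable without AWS credentials.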
