pod hangs in Pending state


Problem description


I have a kubernetes deployment in which I am trying to run 5 docker containers inside a single pod on a single node. The containers hang in "Pending" state and are never scheduled. I do not mind running more than 1 pod, but I'd like to keep the number of nodes down. I assumed 1 node with 1 CPU and 1.7G RAM would be enough for the 5 containers, and I have attempted to split the workload across them.


Initially I came to the conclusion that I have insufficient resources. I enabled autoscaling of nodes which produced the following (see kubectl describe pod command):


pod didn't trigger scale-up (it wouldn't fit if a new node is added)


Anyway, each docker container has a simple command which runs a fairly simple app. Ideally I wouldn't like to have to deal with setting CPU and RAM resource allocations, but even after setting CPU/memory limits within bounds so they don't add up to more than 1, I still get this (see kubectl describe po/test-529945953-gh6cl):


No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (1).


Below are various commands that show the state. Any help on what I'm doing wrong will be appreciated.


kubectl get all

user_s@testing-11111:~/gce$ kubectl get all
NAME                          READY     STATUS    RESTARTS   AGE
po/test-529945953-gh6cl   0/5       Pending   0          34m

NAME             CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
svc/kubernetes   10.7.240.1   <none>        443/TCP   19d

NAME              DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deploy/test   1         1         1            0           34m

NAME                    DESIRED   CURRENT   READY     AGE
rs/test-529945953   1         1         0         34m
user_s@testing-11111:~/gce$


kubectl describe po/test-529945953-gh6cl

user_s@testing-11111:~/gce$ kubectl describe po/test-529945953-gh6cl
Name:           test-529945953-gh6cl
Namespace:      default
Node:           <none>
Labels:         app=test
                pod-template-hash=529945953
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"test-529945953","uid":"c6e889cb-a2a0-11e7-ac18-42010a9a001a"...
Status:         Pending
IP:
Created By:     ReplicaSet/test-529945953
Controlled By:  ReplicaSet/test-529945953
Containers:
  container-test2-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      test2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  container-kraken-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      arg2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  container-gdax-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      arg2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  container-bittrex-tickers:
    Image:      gcr.io/testing-11111/testology:latest
    Port:       <none>
    Command:
      process_cmd
      arg1
      arg2
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:      100m
      memory:   375Mi
    Environment:
      DB_HOST:          127.0.0.1:5432
      DB_PASSWORD:      <set to the key 'password' in secret 'cloudsql-db-credentials'> Optional: false
      DB_USER:          <set to the key 'username' in secret 'cloudsql-db-credentials'> Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
  cloudsql-proxy:
    Image:      gcr.io/cloudsql-docker/gce-proxy:1.09
    Port:       <none>
    Command:
      /cloud_sql_proxy
      --dir=/cloudsql
      -instances=testing-11111:europe-west2:testology=tcp:5432
      -credential_file=/secrets/cloudsql/credentials.json
    Limits:
      cpu:      150m
      memory:   375Mi
    Requests:
      cpu:              100m
      memory:           375Mi
    Environment:        <none>
    Mounts:
      /cloudsql from cloudsql (rw)
      /etc/ssl/certs from ssl-certs (rw)
      /secrets/cloudsql from cloudsql-instance-credentials (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-b2mxc (ro)
Conditions:
  Type          Status
  PodScheduled  False
Volumes:
  cloudsql-instance-credentials:
    Type:       Secret (a volume populated by a Secret)
    SecretName: cloudsql-instance-credentials
    Optional:   false
  ssl-certs:
    Type:       HostPath (bare host directory volume)
    Path:       /etc/ssl/certs
  cloudsql:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-b2mxc:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-b2mxc
    Optional:   false
QoS Class:      Burstable
Node-Selectors: <none>
Tolerations:    node.alpha.kubernetes.io/notReady:NoExecute for 300s
                node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
  FirstSeen     LastSeen        Count   From                    SubObjectPath   Type            Reason                  Message
  ---------     --------        -----   ----                    -------------   --------        ------                  -------
  27m           17m             44      default-scheduler                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (2).
  26m           8s              150     cluster-autoscaler                      Normal          NotTriggerScaleUp       pod didn't trigger scale-up (it wouldn't fit if a new node is added)
  16m           2s              63      default-scheduler                       Warning         FailedScheduling        No nodes are available that match all of the following predicates:: Insufficient cpu (1), Insufficient memory (1).
user_s@testing-11111:~/gce$



kubectl get nodes

user_s@testing-11111:~/gce$ kubectl get nodes
NAME                                      STATUS    AGE       VERSION
gke-test-default-pool-abdf83f7-p4zw   Ready     9h        v1.6.7


kubectl get pods

user_s@testing-11111:~/gce$ kubectl get pods
NAME                       READY     STATUS    RESTARTS   AGE
test-529945953-gh6cl   0/5       Pending   0          38m


kubectl describe nodes

user_s@testing-11111:~/gce$ kubectl describe nodes
Name:                   gke-test-default-pool-abdf83f7-p4zw
Role:
Labels:                 beta.kubernetes.io/arch=amd64
                        beta.kubernetes.io/fluentd-ds-ready=true
                        beta.kubernetes.io/instance-type=g1-small
                        beta.kubernetes.io/os=linux
                        cloud.google.com/gke-nodepool=default-pool
                        failure-domain.beta.kubernetes.io/region=europe-west2
                        failure-domain.beta.kubernetes.io/zone=europe-west2-c
                        kubernetes.io/hostname=gke-test-default-pool-abdf83f7-p4zw
Annotations:            node.alpha.kubernetes.io/ttl=0
                        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:                 <none>
CreationTimestamp:      Tue, 26 Sep 2017 02:05:45 +0100
Conditions:
  Type                  Status  LastHeartbeatTime                       LastTransitionTime                      Reason                          Message
  ----                  ------  -----------------                       ------------------                      ------                          -------
  NetworkUnavailable    False   Tue, 26 Sep 2017 02:06:05 +0100         Tue, 26 Sep 2017 02:06:05 +0100         RouteCreated                    RouteController created a route
  OutOfDisk             False   Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KubeletHasSufficientDisk        kubelet has sufficient disk space available
  MemoryPressure        False   Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KubeletHasSufficientMemory      kubelet has sufficient memory available
  DiskPressure          False   Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KubeletHasNoDiskPressure        kubelet has no disk pressure
  Ready                 True    Tue, 26 Sep 2017 11:33:57 +0100         Tue, 26 Sep 2017 02:06:05 +0100         KubeletReady                    kubelet is posting ready status. AppArmor enabled
  KernelDeadlock        False   Tue, 26 Sep 2017 11:33:12 +0100         Tue, 26 Sep 2017 02:05:45 +0100         KernelHasNoDeadlock             kernel has no deadlock
Addresses:
  InternalIP:   10.154.0.2
  ExternalIP:   35.197.217.1
  Hostname:     gke-test-default-pool-abdf83f7-p4zw
Capacity:
 cpu:           1
 memory:        1742968Ki
 pods:          110
Allocatable:
 cpu:           1
 memory:        1742968Ki
 pods:          110
System Info:
 Machine ID:                    e6119abf844c564193495c64fd9bd341
 System UUID:                   E6119ABF-844C-5641-9349-5C64FD9BD341
 Boot ID:                       1c2f2ea0-1f5b-4c90-9e14-d1d9d7b75221
 Kernel Version:                4.4.52+
 OS Image:                      Container-Optimized OS from Google
 Operating System:              linux
 Architecture:                  amd64
 Container Runtime Version:     docker://1.11.2
 Kubelet Version:               v1.6.7
 Kube-Proxy Version:            v1.6.7
PodCIDR:                        10.4.1.0/24
ExternalID:                     6073438913956157854
Non-terminated Pods:            (7 in total)
  Namespace                     Name                                                            CPU Requests    CPU Limits      Memory Requests Memory Limits
  ---------                     ----                                                            ------------    ----------      --------------- -------------
  kube-system                   fluentd-gcp-v2.0-k565g                                          100m (10%)      0 (0%)          200Mi (11%)     300Mi (17%)
  kube-system                   heapster-v1.3.0-3440173064-1ztvw                                138m (13%)      138m (13%)      301456Ki (17%)  301456Ki (17%)
  kube-system                   kube-dns-1829567597-gdz52                                       260m (26%)      0 (0%)          110Mi (6%)      170Mi (9%)
  kube-system                   kube-dns-autoscaler-2501648610-7q9dd                            20m (2%)        0 (0%)          10Mi (0%)       0 (0%)
  kube-system                   kube-proxy-gke-test-default-pool-abdf83f7-p4zw              100m (10%)      0 (0%)          0 (0%)          0 (0%)
  kube-system                   kubernetes-dashboard-490794276-25hmn                            100m (10%)      100m (10%)      50Mi (2%)       50Mi (2%)
  kube-system                   l7-default-backend-3574702981-flqck                             10m (1%)        10m (1%)        20Mi (1%)       20Mi (1%)
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits      Memory Requests Memory Limits
  ------------  ----------      --------------- -------------
  728m (72%)    248m (24%)      700816Ki (40%)  854416Ki (49%)
Events:         <none>

Answer


As you can see in the output of your kubectl describe nodes command under Allocated resources:, 728m (72%) CPU and 700816Ki (40%) memory are already requested by the pods running in the kube-system namespace on the node. The resource requests of your test pod exceed both the remaining CPU and the remaining memory available on the node, as you can see under Events in your kubectl describe po/… output.
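The scheduler's arithmetic can be checked by hand. A minimal sketch, using only the figures that appear in the kubectl describe nodes output above (1 CPU = 1000m allocatable, 1742968Ki memory, and the kube-system requests of 728m / 700816Ki) and the per-container requests of the test pod (5 × 100m CPU, 5 × 375Mi memory):

```python
# Rough feasibility check mirroring the scheduler's request-based arithmetic.
# All node figures are taken from the `kubectl describe nodes` output above.

MI = 1024  # Ki per Mi

node_allocatable = {"cpu_m": 1000, "mem_ki": 1742968}   # 1 CPU, ~1.7G RAM
system_requests  = {"cpu_m": 728,  "mem_ki": 700816}    # kube-system pods

# The test pod: 5 containers, each requesting 100m CPU / 375Mi memory.
pod_requests = {"cpu_m": 5 * 100, "mem_ki": 5 * 375 * MI}

# What is left after the system pods' requests are subtracted.
remaining = {k: node_allocatable[k] - system_requests[k] for k in node_allocatable}

print(f"remaining cpu: {remaining['cpu_m']}m, pod needs {pod_requests['cpu_m']}m")
print(f"remaining mem: {remaining['mem_ki']}Ki, pod needs {pod_requests['mem_ki']}Ki")

fits = (pod_requests["cpu_m"] <= remaining["cpu_m"]
        and pod_requests["mem_ki"] <= remaining["mem_ki"])
print("pod fits on node:", fits)
```

Only 272m CPU and about 1042152Ki (~1017Mi) remain, while the pod as a whole asks for 500m and 1875Mi, so both the CPU and the memory predicate fail, which is exactly the "Insufficient cpu (1), Insufficient memory (1)" event above.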


If you want to keep all containers in a single pod, you need to reduce the resource requests of your containers or run them on a node with more CPU and memory. A better solution would be to split your application into multiple pods, which enables distribution across multiple nodes.
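Splitting also explains why the autoscaler reported "it wouldn't fit if a new node is added": a pod is scheduled as a unit, so the whole 5-container pod must fit on one node, fresh or not. A single-container pod, by contrast, fits easily. A sketch of that comparison, assuming (this overhead figure is an assumption, not from the output above) that a fresh g1-small node carries only DaemonSet-style system pods of roughly 200m CPU / 200Mi memory:

```python
# Hypothetical fit check on a *fresh* node of the same g1-small shape.
# ASSUMPTION: a new node runs ~200m CPU / 200Mi memory of system pods;
# the real overhead depends on which DaemonSets land on it.

MI = 1024  # Ki per Mi

fresh_node = {"cpu_m": 1000 - 200, "mem_ki": 1742968 - 200 * MI}

whole_pod  = {"cpu_m": 5 * 100, "mem_ki": 5 * 375 * MI}  # all 5 containers
single_pod = {"cpu_m": 100,     "mem_ki": 375 * MI}      # one container per pod

def fits(pod, node):
    """True if the pod's requests fit within the node's remaining resources."""
    return pod["cpu_m"] <= node["cpu_m"] and pod["mem_ki"] <= node["mem_ki"]

print("5-container pod fits on a fresh node:", fits(whole_pod, fresh_node))
print("1-container pod fits on a fresh node:", fits(single_pod, fresh_node))
```

Under these assumed numbers the 5-container pod still fails on memory (1875Mi requested vs. roughly 1502Mi free), while each single-container pod fits, so with one pod per container the autoscaler could add nodes until everything is scheduled.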
