Why is memory usage greater than what I set on my Kubernetes node?


Problem description

I allocated memory to only one pod: 650MB, about 30% of the node's memory (together with the other built-in pods, the total memory limit is only 69%).

However, while the pod is handling a task, the pod's own usage stays within 650MB, yet the overall usage of the node reaches 94%.

Why does this happen when the upper limit is supposed to be 69%? Is it caused by the other built-in pods which did not set a limit? How can I prevent this, since my pod sometimes fails when memory usage exceeds 100%?

My allocation setting (kubectl describe nodes):

Memory usage of the Kubernetes node and pods when idle (kubectl top nodes, kubectl top pods):

Memory usage of the Kubernetes node and pods when running a task (kubectl top nodes, kubectl top pods):

Further tested behaviour:
1. Prepare a deployment, pods, and a service under the namespace test-ns
2. Since only kube-system and test-ns have pods, assign a default limit of 1000Mi to each of them (visible in kubectl describe nodes), aiming to stay below 2GB in total
3. Given that the memory used by kube-system and test-ns should then be less than 2GB, i.e. less than 100%, why can the reported memory usage be 106%?

The .yaml file:

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-mem-limit
      namespace: test-ns
    spec:
      limits:
      - default:
          memory: 1000Mi
        type: Container
    ---
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-mem-limit
      namespace: kube-system
    spec:
      limits:
      - default:
          memory: 1000Mi
        type: Container
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: devops-deployment
      namespace: test-ns
      labels:
        app: devops-pdf
    spec:
      selector:
        matchLabels:
          app: devops-pdf
      replicas: 2
      template:
        metadata:
          labels:
            app: devops-pdf
        spec:
          containers:
          - name: devops-pdf
            image: dev.azurecr.io/devops-pdf:latest
            imagePullPolicy: Always
            ports:
            - containerPort: 3000
            resources:
              requests:
                cpu: 600m
                memory: 500Mi
              limits:
                cpu: 600m
                memory: 500Mi
          imagePullSecrets:
          - name: regcred
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: devops-pdf
      namespace: test-ns
    spec:
      type: LoadBalancer
      ports:
      - port: 8007
      selector:
        app: devops-pdf
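
A minimal sketch of applying the manifest above and verifying the resulting limits; the file name devops.yaml is an assumption, the kubectl commands themselves are standard:

    # Apply the manifest above (file name is an assumption)
    kubectl apply -f devops.yaml

    # Confirm the LimitRange defaults were registered in both namespaces
    kubectl describe limitrange default-mem-limit -n test-ns
    kubectl describe limitrange default-mem-limit -n kube-system

    # Check the effective requests/limits on the resulting pods
    kubectl get pods -n test-ns -o custom-columns='NAME:.metadata.name,MEM_REQ:.spec.containers[*].resources.requests.memory,MEM_LIM:.spec.containers[*].resources.limits.memory'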

Recommended answer

This effect is most likely caused by the 4 pods running on that node without a memory limit specified, shown as 0 (0%). Of course, 0 doesn't mean a pod can't use even a single byte of memory, since no program can start without using memory; instead it means there is no limit, so it can use as much memory as is available. Also, programs running outside of pods (ssh, cron, ...) are included in the total usage figure, but are not limited by Kubernetes (via cgroups).
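
To spot which pods contribute without a limit, a listing like the following can help (a sketch, assuming kubectl access to the cluster; the column names are arbitrary):

    # List memory requests and limits of all pods; entries showing <none> under MEM_LIM
    # are the ones that can keep growing until the node itself runs out of memory.
    kubectl get pods --all-namespaces -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MEM_REQ:.spec.containers[*].resources.requests.memory,MEM_LIM:.spec.containers[*].resources.limits.memory'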

Now Kubernetes sets up the kernel OOM adjustment values in a tricky way to favour containers that are under their memory request, making it more likely to kill processes in containers that are between their memory request and limit, and most likely to kill processes in containers with no memory limits. However, this only works out fairly in the long run, and sometimes the kernel can kill your favourite process in your favourite container that is behaving well (using less than its memory request). See https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#node-oom-behavior

The pods without a memory limit in this particular case come from the AKS system itself, so setting their memory limit in the pod templates is not an option, as there is a reconciler that will (eventually) restore it. To remedy the situation, I suggest you create a LimitRange object in the kube-system namespace that will assign a memory limit to all pods created without one:

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-mem-limit
      namespace: kube-system
    spec:
      limits:
      - default:
          memory: 150Mi
        type: Container

(You will need to delete the already existing pods without a memory limit for this to take effect; they will be recreated.)
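
A sketch of that clean-up step, assuming the LimitRange above was saved as kube-system-limitrange.yaml and substituting the names of the pods that currently show no memory limit:

    # Apply the LimitRange, then recreate the kube-system pods that have no limit yet,
    # so the 150Mi default gets injected when they come back up
    kubectl apply -f kube-system-limitrange.yaml
    kubectl delete pod -n kube-system <pod-without-limit>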

This is not going to completely eliminate the problem, as you might still end up with an overcommitted node; however, the memory usage will make sense and the OOM events will be more predictable.

