How to use K8S HPA and autoscaler when Pods normally need low CPU but periodically scale


Problem description

I am trying to determine a reliable setup to use with K8S to scale one of my deployments using an HPA and an autoscaler. I want to minimize the amount of resources overcommitted but allow it to scale up as needed.

I have a deployment that is managing a REST API service. Most of the time the service will have very low usage (0m-5m cpu). But periodically through the day or week it will spike to much higher usage on the order of 5-10 CPUs (5000m-10000m).

My initial pass at configuring this is:

  • Deployment: 1 replica
"resources": {
   "requests": {
     "cpu": 0.05
   },
   "limits": {
      "cpu": 1.0
   }
}

  • HPA (fragment; see the full manifest sketch after this list):
    "spec": {
       "maxReplicas": 25,
       "metrics": [
          {
             "resource": {
                "name": "cpu",
                "target": {
                   "averageValue": 0.75,
                   "type": "AverageValue"
                }
             },
             "type": "Resource"
          }
       ],
       "minReplicas": 1,
       ...
    }
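
For reference, here is a minimal sketch of what the complete HPA object might look like. The autoscaling/v2 API version and the Deployment/HPA name rest-api are assumptions (the question only shows the spec fragment); the "750m" averageValue is the same quantity as the 0.75 above:

    {
       "apiVersion": "autoscaling/v2",
       "kind": "HorizontalPodAutoscaler",
       "metadata": {
          "name": "rest-api"
       },
       "spec": {
          "scaleTargetRef": {
             "apiVersion": "apps/v1",
             "kind": "Deployment",
             "name": "rest-api"
          },
          "minReplicas": 1,
          "maxReplicas": 25,
          "metrics": [
             {
                "type": "Resource",
                "resource": {
                   "name": "cpu",
                   "target": {
                      "type": "AverageValue",
                      "averageValue": "750m"
                   }
                }
             }
          ]
       }
    }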
    

This is running on an AWS EKS cluster with autoscaler running. All instances have 2 CPUs. The goal is that as the CPU usage goes up the HPA will allocate a new pod that will be unschedulable and then the autoscaler will allocate a new node. As I add load on the service, the CPU usage for the first pod spikes up to approximately 90-95% at max.

I am running into two related problems:

1. Small request size

By using such a small request value (cpu: 0.05), the newly requested pods can be easily scheduled on the current node even when it is under high load. Thus the autoscaler never finds a pod that can't be scheduled and doesn't allocate a new node. I could increase the small request size and overcommit, but this then means that for the vast majority of the time when there is no load I will be wasting resources I don't need.
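
Rough arithmetic behind this (the ~1.9 CPU allocatable figure is an assumption; the exact value depends on the instance type and system reservations):

    allocatable CPU on a 2-CPU node   ≈ 1900m
    CPU request per pod               =   50m
    pods the scheduler can still fit  ≈ 1900m / 50m = 38

The cluster autoscaler only adds nodes when it sees pods stuck in Pending, so with a request this small it effectively never fires.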

2. Average CPU decreases as more pods are allocated

Because the pods all get allocated on the same node, once a new pod is allocated it starts sharing the node's available 2 CPUs. This in turn reduces the amount of CPU used by each pod and thus keeps the average value below the 75% target.

(ex: 3 pods, 2 CPUs ==> max 66% average CPU usage per pod)
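
Plugging that into the formula the HPA controller uses (from the Kubernetes HPA documentation) shows why scaling stalls; 660m below is simply ~66% of one CPU written as a quantity:

    desiredReplicas = ceil(currentReplicas × currentAverageValue / targetAverageValue)
                    = ceil(3 × 660m / 750m)
                    = ceil(2.64)
                    = 3

Once the pods start sharing the node, the measured average drops below the 750m target, so the HPA stops requesting more replicas even though the service is CPU-starved.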

I am looking for guidance here on how I should be thinking about this problem. I think I am missing something simple.

My current thought is that what I am looking for is a way for the Pod resource request value to increase under heavier load and then decrease back down when the system doesn't need it. That would point me toward using something like a VPA, but everything I have read says that using HPA and VPA at the same time leads to very bad things.

I think increasing the request from 0.05 to something like 0.20 would probably let me handle the case of scaling up. But this will in turn waste a lot of resources and could suffer issues if the scheduler finds space on an existing node. My example is about one service but there are many more services in the production deployment. I don't want to have nodes sitting empty with committed resources but no usage.

What is the best path forward here?

Recommended answer

Sounds like you need a scheduler that takes actual CPU utilization into account. This is not supported yet.

There seems to be work on this feature: KEP - Trimaran: Real Load Aware Scheduling using the TargetLoadPacking plugin. Also see New scheduler priority for real load average and free memory.

In the meanwhile, if the CPU limit is 1 core and the nodes autoscale under high CPU utilization, it sounds like it should work if the nodes are substantially bigger than the CPU limits for the pods. E.g. try with nodes that have 4 cores or more, and possibly a slightly larger CPU request for the pods?
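
As a concrete sketch of that suggestion (the 250m request is purely illustrative, not a value from the answer):

    "resources": {
       "requests": {
          "cpu": "250m"
       },
       "limits": {
          "cpu": 1.0
       }
    }

A request of this size commits noticeably more capacity per replica than 50m does, so a burst of new replicas fills a node much sooner and the cluster autoscaler gets Pending pods to react to; the trade-off is exactly the idle overcommit the question worries about.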

