How to defrag resource utilization of GKE nodes with HPA and Cluster Autoscaler


Question

Using HPA (Horizontal Pod Autoscaler) and Cluster Autoscaler on GKE, pods and nodes scale up as expected. However, when demand decreases, pods seem to be deleted from random nodes, leaving some nodes under-utilized. That is not cost effective...

The HPA is driven by the single metric targetCPUUtilizationPercentage. VPA is not used.

This is a redacted YAML file for the Deployment and HPA:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  replicas: 1
  template:
    spec:
      containers:
      - name: c1
        resources:
          requests:
            cpu: 200m
            memory: 1.2G
      - name: c2
        resources:
          requests:
            cpu: 10m
        volumeMounts:
        - name: log-share
          mountPath: /mnt/log-share
      - name: c3
        resources:
          requests:
            cpu: 10m
          limits:
            cpu: 100m
        volumeMounts:
        - name: log-share
          mountPath: /mnt/log-share
      volumes:
      - name: log-share
        emptyDir: {}

---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: foo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 1
  maxReplicas: 60
  targetCPUUtilizationPercentage: 80
...
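For reference, the same scaling policy can be written against the newer autoscaling/v2 API, which expresses the CPU target as an explicit resource metric instead of the shorthand targetCPUUtilizationPercentage field (a sketch, assuming autoscaling/v2 is available on the cluster version in use):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: foo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: foo
  minReplicas: 1
  maxReplicas: 60
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80

The v2 form also allows adding memory or custom metrics later without changing the API version again.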

An emptyDir volume is included to make the example valid.

How can this situation be improved?

There are some ideas, but none of them solves the issue completely...

  • Configure the node pool machine type and the pod resource requests so that only one pod fits on a node. If the HPA removes a pod from a node, the node is deleted after a while. However, this does not work for deployments with varying resource requests.
  • Use preemptible nodes as much as possible...
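The first idea (one pod per node) can also be approximated with pod anti-affinity instead of machine-type sizing. A sketch, assuming the pods carry an app: foo label (note this still does not bin-pack mixed workloads efficiently):

spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: foo
            topologyKey: kubernetes.io/hostname

With topologyKey: kubernetes.io/hostname, the scheduler refuses to place two foo pods on the same node, so a node freed by scale-down holds no leftover foo pod.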

Answer

Sorry, I failed to mention the use of emptyDir (the YAML in the question has been edited).

As I commented on the question myself, I found "What types of pods can prevent CA from removing a node?" in the Cluster Autoscaler FAQ:

Pods with local storage. *

An emptyDir volume counts as local storage, so I needed to add the following annotation to the deployment's pod template to mark the pods as safe to evict from less-utilized nodes.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  selector:
    matchLabels:
      app: foo
  template:
    metadata:
      labels:
        app: foo
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
    spec:
      ...

After specifying the annotation, the GCE instance group backing the GKE node pool became smaller than before. I think it worked!

Thanks to everyone who commented on the question!
