Kubernetes scheduling for expensive resources
Question
We have a Kubernetes cluster.
Now we want to expand it with GPU nodes, which would be the only nodes in the cluster that have GPUs.
We'd like to keep Kubernetes from scheduling pods on those nodes unless the pods require GPUs.
Not all of our pipelines can use GPUs. The vast majority are still CPU-heavy only.
Servers with GPUs can be very expensive (for example, an Nvidia DGX can cost as much as $150k per server).
If we just add DGX nodes to the Kubernetes cluster, Kubernetes would schedule non-GPU workloads there too, which would be a waste of resources. For example, jobs scheduled later that do need GPUs may find the non-GPU resources on those nodes (CPU and memory) already exhausted, so they would have to wait for the non-GPU jobs/containers to finish.
Is there a way to customize GPU resource scheduling in Kubernetes so that it only schedules pods on those expensive nodes if they require GPUs? Pods that don't need GPUs may have to wait for other resources, such as CPU and memory, to become available on non-GPU servers...
Thanks.
Recommended answer
Using labels and label selectors for your nodes is right. But you need to use NodeAffinity on your pods.
Something like this:
apiVersion: v1
kind: Pod
metadata:
  name: run-with-gpu
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/node-type
            operator: In
            values:
            - gpu
  containers:
  - name: your-gpu-workload
    image: mygpuimage
Also, attach the label to your GPU nodes:
$ kubectl label nodes <node-name> kubernetes.io/node-type=gpu
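One caveat worth noting: node affinity steers the GPU pod toward the labeled nodes, but on its own it does not stop pods without that affinity from being scheduled there. A common complement, which is not part of the original answer, is to taint the GPU nodes and give GPU pods a matching toleration. The following is a minimal sketch under stated assumptions: the taint key and value `dedicated=gpu` are hypothetical names chosen for illustration, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is running in the cluster.

```yaml
# Sketch only: the taint "dedicated=gpu" is a hypothetical choice, and
# nvidia.com/gpu assumes the NVIDIA device plugin is installed.
# Assumes the GPU nodes were tainted beforehand, for example:
#   kubectl taint nodes <node-name> dedicated=gpu:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: run-with-gpu
spec:
  # Allow this pod onto the tainted GPU nodes; pods without this
  # toleration are not scheduled there.
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu
      effect: NoSchedule
  # Same node affinity as in the answer above: require the GPU label.
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/node-type
                operator: In
                values:
                  - gpu
  containers:
    - name: your-gpu-workload
      image: mygpuimage
      resources:
        limits:
          nvidia.com/gpu: 1  # ask the device plugin for one GPU
```

With this combination, the taint keeps CPU-only pods off the expensive nodes, while the affinity keeps GPU pods from landing anywhere else.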