GKE does not scale to/from 0 when autoscaling enabled
Question
I want to run a CronJob on my GKE in order to perform a batch operation on a daily basis. The ideal scenario would be for my cluster to scale to 0 nodes when the job is not running and to dynamically scale to 1 node and run the job on it every time the schedule is met.
I am first trying to achieve this with a simple CronJob from the Kubernetes documentation that only prints the current time and terminates.
I first created a cluster with the following command:
gcloud container clusters create $CLUSTER_NAME \
    --enable-autoscaling \
    --min-nodes 0 --max-nodes 1 --num-nodes 1 \
    --zone $CLUSTER_ZONE
Then, I created a CronJob with the following description:
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: Never
The job is scheduled to run every hour and to print the current time before terminating.
First, I wanted to create the cluster with 0 nodes, but setting --num-nodes 0 results in an error. Why is that? Note that I can manually scale the cluster down to 0 nodes after it has been created.
Second, if my cluster has 0 nodes, the job won't be scheduled because the cluster does not scale to 1 node automatically but instead gives the following error:
Cannot schedule pods: no nodes available to schedule pods.
Third, if my cluster has 1 node, the job runs normally, but afterwards the cluster does not scale down to 0 nodes and stays at 1 node instead. I let my cluster run for two successive jobs and it did not scale down in between. I assume one hour should be long enough for the cluster to do so.
What am I missing?
I've got it to work and detailed my solution here.
Recommended answer
Update:
Note: Beginning with Kubernetes version 1.7, you can specify a minimum size of zero for your node pool. This allows your node pool to scale down completely if the instances within aren't required to run your workloads.
https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler
Old answer:
Scaling the entire cluster to 0 is not supported, because you always need at least one node for system pods.
You could create one node pool with a small machine for system pods, and an additional node pool with a big machine where you would run your workload. This way the second node pool can scale down to 0 and you still have space to run the system pods.
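The two-pool layout described above might look like the following sketch. The pool names `system-pool` and `batch-pool`, the machine types, the default cluster name and zone, and the `DRY_RUN` guard are all assumptions made for illustration and are not part of the original answer. By default the script only prints the `gcloud` commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Sketch: a small always-on pool for system pods plus an autoscaled pool
# (minimum 0) for the batch workload. Pool names and machine types are
# hypothetical; adjust them to your project.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
CLUSTER_ZONE="${CLUSTER_ZONE:-us-central1-a}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

create_pools() {
  # Small fixed-size pool so the kube-system pods always have a node.
  run gcloud container node-pools create system-pool \
    --cluster "$CLUSTER_NAME" --zone "$CLUSTER_ZONE" \
    --machine-type e2-small --num-nodes 1

  # Workload pool: created with 1 node (not 0) and allowed to
  # autoscale between 0 and 1 nodes.
  run gcloud container node-pools create batch-pool \
    --cluster "$CLUSTER_NAME" --zone "$CLUSTER_ZONE" \
    --machine-type e2-standard-4 --num-nodes 1 \
    --enable-autoscaling --min-nodes 0 --max-nodes 1
}

create_pools
```

To keep the batch pods off the small pool, one option is a `nodeSelector` in the CronJob's pod template on the node label GKE sets for each pool, `cloud.google.com/gke-nodepool` (here with the hypothetical value `batch-pool`).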
After attempting, @xEc mentions: Also note that there are scenarios in which my node pool wouldn't scale, like if I created the pool with an initial size of 0 instead of 1.
Initial suggestion:
Perhaps you could run a micro VM, with cron to scale the cluster up, submit a Job (instead of CronJob), wait for it to finish and then scale it back down to 0?
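The cron-driven approach could be scripted roughly as follows. This is only a sketch under assumptions: the pool name `batch-pool`, the manifest file `job.yaml`, the Job name `hello`, the 30-minute timeout, and the `DRY_RUN` guard are hypothetical, and `kubectl` is assumed to already have credentials for the cluster (for example via `gcloud container clusters get-credentials`). By default the script only prints the commands; set `DRY_RUN=0` to run them.

```shell
#!/bin/sh
# Sketch: scale the pool up, run a one-off Job, wait for it, scale back to 0.
# All names below are hypothetical placeholders.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
CLUSTER_ZONE="${CLUSTER_ZONE:-us-central1-a}"
NODE_POOL="${NODE_POOL:-batch-pool}"
JOB_NAME="${JOB_NAME:-hello}"
JOB_MANIFEST="${JOB_MANIFEST:-job.yaml}"

# Print each command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run_batch() {
  # Bring the pool up to one node.
  run gcloud container clusters resize "$CLUSTER_NAME" \
    --node-pool "$NODE_POOL" --num-nodes 1 \
    --zone "$CLUSTER_ZONE" --quiet

  # Remove the previous run's Job (if any), then submit a fresh one.
  run kubectl delete job "$JOB_NAME" --ignore-not-found
  run kubectl apply -f "$JOB_MANIFEST"

  # Block until the Job completes; report a timeout but keep going so the
  # pool is still scaled down afterwards.
  run kubectl wait --for=condition=complete --timeout=30m "job/$JOB_NAME" \
    || echo "job/$JOB_NAME did not complete in time" >&2

  # Scale the pool back down to zero nodes.
  run gcloud container clusters resize "$CLUSTER_NAME" \
    --node-pool "$NODE_POOL" --num-nodes 0 \
    --zone "$CLUSTER_ZONE" --quiet
}

run_batch
```

On the micro VM, a crontab entry such as `0 3 * * * DRY_RUN=0 /path/to/run-batch.sh` (path hypothetical) would trigger the script once a day.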