Kubernetes - processing an unlimited number of work-items

Question

I need to get a work-item from a work-queue and then sequentially run a series of containers to process each work-item. This can be done using initContainers (https://stackoverflow.com/a/46880653/94078)

What would be the recommended way of restarting the process to get the next work-item?

  • Jobs seem ideal but don't seem to support an infinite/indefinite number of completions.
  • Using a single Pod doesn't work because initContainers aren't restarted (https://github.com/kubernetes/kubernetes/issues/52345).
  • I would prefer to avoid the maintenance/learning overhead of a system like argo or brigade.

Thanks!

Answer

Jobs should be used for working with work queues. When using a work queue you should not set .spec.completions (or set it to null). In that case Pods will keep getting created until one of them exits successfully. It is a little awkward to exit the (main) container with a failure status on purpose, but that is what the specification calls for. You may set .spec.parallelism to your liking irrespective of this setting; I've set it to 1 since it appears you do not want any parallelism.
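In manifest form, the relevant fields are just these two (a fragment only; the complete manifests follow below):

```yaml
apiVersion: batch/v1
kind: Job
spec:
  parallelism: 1
  # completions intentionally not set (null): work-queue mode.
  # Pods keep being created until one of them exits successfully.
```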

In your question you did not specify what you want to do if the work queue becomes empty, so I will give two solutions: one that waits for new items (infinite) and one that ends the Job when the work queue is empty (a finite, but indefinite, number of items).

Both examples use redis, but you can apply this pattern to your favorite queue. Note that the part that pops an item from the queue is not safe; if your Pod dies for some reason after having popped an item, that item will remain unprocessed or not fully processed. See the reliable-queue pattern for a proper solution.
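For reference, the reliable-queue pattern keeps the popped item on a second "processing" list until the worker acknowledges it, so an item held by a crashed Pod can be recovered. With redis that is RPOPLPUSH (LMOVE since redis 6.2) to pop, then LREM to acknowledge. The file-based sketch below only illustrates the flow; the pending/processing names and job-1/job-2 items are made up for the illustration:

```shell
#!/bin/sh
# File-based sketch of the reliable-queue pattern. With redis the same
# flow is RPOPLPUSH to move an item onto a "processing" list, then
# LREM to acknowledge it once the work is done. A worker that dies
# mid-work leaves its item on "processing", where it can be recovered.
DIR=$(mktemp -d)
printf '%s\n' job-1 job-2 > "$DIR/pending"
: > "$DIR/processing"

# Pop: move the head item from pending to processing (redis: RPOPLPUSH).
ITEM=$(head -n 1 "$DIR/pending")
tail -n +2 "$DIR/pending" > "$DIR/pending.tmp" && mv "$DIR/pending.tmp" "$DIR/pending"
echo "$ITEM" >> "$DIR/processing"

echo "working on $ITEM ..."

# Ack: remove the finished item from processing (redis: LREM processing 1 item).
grep -v "^$ITEM$" "$DIR/processing" > "$DIR/processing.tmp" || true
mv "$DIR/processing.tmp" "$DIR/processing"
```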

To implement the sequential steps on each work item I've used init containers. Note that this really is a primitive solution, but your options are limited if you don't want to use a framework to implement a proper pipeline.

There is an asciinema recording if anyone would like to see this in action without deploying redis, etc.

Redis

To test this you'll need to create, at a minimum, a redis Pod and a Service. I am using the example from the fine parallel processing work queue task. You can deploy those with:

kubectl apply -f https://rawgit.com/kubernetes/website/master/docs/tasks/job/fine-parallel-processing-work-queue/redis-pod.yaml
kubectl apply -f https://rawgit.com/kubernetes/website/master/docs/tasks/job/fine-parallel-processing-work-queue/redis-service.yaml

The rest of this solution assumes that a Service named redis exists in the same namespace as your Job, that it requires no authentication, and that there is a Pod named redis-master.

To insert some items into the work queue, use this command (you will need bash for this to work):

echo -ne "rpush job "{1..10}"\n" | kubectl exec -it redis-master -- redis-cli
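To see why bash is needed: the brace expansion hands echo ten separate arguments, and echo -e turns each embedded \n into a real newline, so redis-cli reads one RPUSH command per line from stdin. You can inspect the piped text without redis at all (lines 2-10 start with a leading space, which redis-cli tolerates):

```shell
# Brace expansion is a bashism, hence "you will need bash".
# echo receives "rpush job 1\n" ... "rpush job 10\n" as ten arguments,
# joins them with spaces, and -e expands each \n into a newline.
bash -c 'echo -ne "rpush job "{1..10}"\n"'
```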

Infinite version

This version waits when the queue is empty, so the Job will never complete.

apiVersion: batch/v1
kind: Job
metadata:
  name: primitive-pipeline-infinite
spec:
  parallelism: 1
  completions: null
  template:
    metadata:
      name: primitive-pipeline-infinite
    spec:
      volumes: [{name: shared, emptyDir: {}}]
      initContainers:
      - name: pop-from-queue-unsafe
        image: redis
        command: ["sh","-c","redis-cli -h redis blpop job 0 >/shared/item.txt"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-1
        image: busybox
        command: ["sh","-c","echo step-1 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-2
        image: busybox
        command: ["sh","-c","echo step-2 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-3
        image: busybox
        command: ["sh","-c","echo step-3 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      containers:
      - name: done
        image: busybox
        command: ["sh","-c","echo all done with `cat /shared/item.txt`; sleep 1; exit 1"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      restartPolicy: Never

Finite version

This version stops the Job when the queue is empty. Note the trick: the pop init container checks whether the queue is empty, and all subsequent init containers (and the main container) exit immediately if it is. This is the mechanism that signals to Kubernetes that the Job is complete and no new Pods need to be created for it.

apiVersion: batch/v1
kind: Job
metadata:
  name: primitive-pipeline-finite
spec:
  parallelism: 1
  completions: null
  template:
    metadata:
      name: primitive-pipeline-finite
    spec:
      volumes: [{name: shared, emptyDir: {}}]
      initContainers:
      - name: pop-from-queue-unsafe
        image: redis
        command: ["sh","-c","redis-cli -h redis lpop job >/shared/item.txt; grep -q . /shared/item.txt || :>/shared/done.txt"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-1
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-1 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-2
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-2 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      - name: step-3
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo step-3 working on `cat /shared/item.txt` ...; sleep 5"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      containers:
      - name: done
        image: busybox
        command: ["sh","-c","[ -f /shared/done.txt ] && exit 0; echo all done with `cat /shared/item.txt`; sleep 1; exit 1"]
        volumeMounts: [{name: shared, mountPath: /shared}]
      restartPolicy: Never
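The empty-queue sentinel logic above can be exercised locally without Kubernetes or redis. In this sketch the lpop call is stubbed out with an empty variable to show how the done.txt flag short-circuits the remaining steps (a simulation only, not the real queue):

```shell
#!/bin/sh
# Simulate the pop init container against an empty queue.
SHARED=$(mktemp -d)
ITEM=""                                   # stand-in for: redis-cli -h redis lpop job
printf '%s' "$ITEM" > "$SHARED/item.txt"
# grep -q . fails on an empty file, so the done flag gets created.
grep -q . "$SHARED/item.txt" || : > "$SHARED/done.txt"

# Simulate a step container: skip the work when the done flag exists.
if [ -f "$SHARED/done.txt" ]; then
  echo "queue empty - job done"           # -> printed in this simulation
else
  echo "step-1 working on $(cat "$SHARED/item.txt") ..."
fi
```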
