Sidecar containers in Kubernetes Jobs?
Question
We use Kubernetes Jobs for a lot of batch computing here, and I'd like to instrument each Job with a monitoring sidecar that updates a centralized tracking system with the job's progress.
The only problem is that I can't figure out what the semantics are (or are supposed to be) of multiple containers in a job.
I gave it a shot anyway, with an alpine sidecar that printed "hello" every second. After my main task completed, the Jobs are considered Successful, and kubectl get pods in Kubernetes 1.2.0 shows:
NAME                                         READY   STATUS      RESTARTS   AGE
job-69541b2b2c0189ba82529830fe6064bd-ddt2b   1/2     Completed   0          4m
job-c53e78aee371403fe5d479ef69485a3d-4qtli   1/2     Completed   0          4m
job-df9a48b2fc89c75d50b298a43ca2c8d3-9r0te   1/2     Completed   0          4m
job-e98fb7df5e78fc3ccd5add85f8825471-eghtw   1/2     Completed   0          4m
And if I describe one of those pods:
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 24 Mar 2016 11:59:19 -0700
Finished: Thu, 24 Mar 2016 11:59:21 -0700
Then GETing the YAML of the job shows information per container:
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2016-03-24T18:59:29Z
    message: 'containers with unready status: [pod-template]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  containerStatuses:
  - containerID: docker://333709ca66462b0e41f42f297fa36261aa81fc099741e425b7192fa7ef733937
    image: luigi-reduce:0.2
    imageID: docker://sha256:5a5e15390ef8e89a450dac7f85a9821fb86a33b1b7daeab9f116be252424db70
    lastState: {}
    name: pod-template
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: docker://333709ca66462b0e41f42f297fa36261aa81fc099741e425b7192fa7ef733937
        exitCode: 0
        finishedAt: 2016-03-24T18:59:30Z
        reason: Completed
        startedAt: 2016-03-24T18:59:29Z
  - containerID: docker://3d2b51436e435e0b887af92c420d175fafbeb8441753e378eb77d009a38b7e1e
    image: alpine
    imageID: docker://sha256:70c557e50ed630deed07cbb0dc4d28aa0f2a485cf7af124cc48f06bce83f784b
    lastState: {}
    name: sidecar
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2016-03-24T18:59:31Z
  hostIP: 10.2.113.74
  phase: Running
So it looks like my sidecar would need to watch the main process (how?) and exit gracefully once it detects that it is alone in the pod. If this is correct, are there best practices or patterns for this? Should the sidecar exit with the return code of the main container, and if so, how does it get that code?
**Update** After further experimentation, I've also discovered the following: if there are two containers in a pod, the pod is not considered successful until all of its containers have returned with exit code 0.
Additionally, if restartPolicy: OnFailure is set on the pod spec, then any container in the pod that terminates with a non-zero exit code is restarted in the same pod. This could be useful for a monitoring sidecar: it could count the number of retries and delete the job after a certain number, as a workaround for the lack of a max-retries setting in Kubernetes jobs at the moment.
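As a rough illustration of the retry-counting idea, the sketch below reads the restartCount field that appears in the containerStatuses output above and decides whether a container has been restarted more often than allowed. The function names and the max_restarts threshold are hypothetical, and the pod argument is assumed to be the pod object already parsed into a Python dict; fetching the pod and deleting the job are left out.

```python
from typing import Optional


def restart_count(pod: dict, container_name: str) -> Optional[int]:
    """Return restartCount for the named container, or None if it is not listed."""
    for cs in pod.get("status", {}).get("containerStatuses", []):
        if cs.get("name") == container_name:
            return cs.get("restartCount", 0)
    return None


def too_many_restarts(pod: dict, container_name: str, max_restarts: int) -> bool:
    """True once the container has been restarted more than max_restarts times.

    max_restarts is a hypothetical limit chosen by the sidecar, not a
    Kubernetes setting.
    """
    count = restart_count(pod, container_name)
    return count is not None and count > max_restarts
```

A sidecar could evaluate too_many_restarts on each poll of its own pod and delete the job once it returns True.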
Recommended answer
You can use the downward API to find out your own pod name from within the sidecar, and then retrieve your own pod from the API server to look up the exit status. Let me know how this goes.
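A minimal sketch of that approach, under several assumptions: the pod name is exposed to the sidecar through the downward API as an environment variable (called POD_NAME here, a name chosen for illustration), the pod's service account is allowed to read pods, and the default in-cluster token, CA certificate and namespace files are mounted at the usual service account path. The main container name pod-template is taken from the output in the question; the function names are hypothetical.

```python
import json
import os
import ssl
import time
import urllib.request
from typing import Optional

# Default mount point for service account credentials inside a pod.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def exit_code_of(pod: dict, container_name: str) -> Optional[int]:
    """Return the exit code of the named container, or None if it has not terminated."""
    for cs in pod.get("status", {}).get("containerStatuses", []):
        if cs.get("name") == container_name:
            terminated = cs.get("state", {}).get("terminated")
            return terminated["exitCode"] if terminated else None
    return None


def fetch_own_pod() -> dict:
    """GET this pod from the API server using the in-cluster credentials."""
    with open(os.path.join(SA_DIR, "token")) as f:
        token = f.read().strip()
    with open(os.path.join(SA_DIR, "namespace")) as f:
        namespace = f.read().strip()
    pod_name = os.environ["POD_NAME"]  # assumed to be set via the downward API
    url = f"https://kubernetes.default.svc/api/v1/namespaces/{namespace}/pods/{pod_name}"
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    context = ssl.create_default_context(cafile=os.path.join(SA_DIR, "ca.crt"))
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def wait_for_main(container_name: str, poll_seconds: float = 2.0) -> int:
    """Poll the pod until the named container has terminated; return its exit code."""
    while True:
        code = exit_code_of(fetch_own_pod(), container_name)
        if code is not None:
            return code
        time.sleep(poll_seconds)
```

The sidecar's entry point could then finish with sys.exit(wait_for_main("pod-template")), so that it stops once the main container is done and exits with the same code.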