Error handling - changing job configuration at runtime


Question

Suppose I have a job schedule that creates jobs at some recurrence interval, and the jobs it creates are associated with a specific predefined pool. If a job or the pool gets stuck, I want to move my workload and future scheduled jobs to a new pool, which requires changing the pool information associated with the jobs at runtime. Is this possible in Azure Batch? If so:

- How can I detect that the job schedule is unable to create a new job on schedule? Or how can I detect that a past job or the pool I am using is stuck? 

- How can I change the pool associated with jobs in my schedule at runtime, if possible? 

Thanks!

Answer

When you set up a job schedule, it is bound to whichever pool you pick during configuration. There is no option to have it moved to another pool automatically. You can, however, modify the job schedule after creation, but that is a manual step. That said, Azure Batch pools run on Azure virtual machine scale sets, which constantly monitor the health of the nodes in the pool. If something goes wrong in the pool, we have built-in logic to fix it. For example, we automatically restart a node that is having an issue. If that does not work, we remove the node and replace it with a new one. So your pool should never be completely broken.
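If you still want to script the manual step, the following is a minimal sketch of the decision logic only, not an official approach. It assumes you have already fetched the pool's node states as plain strings (the state names `unusable`, `starttaskfailed`, `unknown` and `offline` are assumed to match what the service reports), and that the job schedule update accepts a body with `jobSpecification.poolInfo.poolId`. The function names, the 50% threshold and the pool IDs are hypothetical. As far as I know, an updated job specification only affects jobs created after the update.

```python
from collections import Counter

# Node states treated as unhealthy; this selection is an assumption of the sketch.
UNHEALTHY_STATES = {"unusable", "starttaskfailed", "unknown", "offline"}


def pool_looks_stuck(node_states, max_unhealthy_fraction=0.5):
    """Return True when too large a share of the pool's nodes is unhealthy."""
    if not node_states:
        # A pool with no nodes cannot run anything, so treat it as stuck.
        return True
    counts = Counter(state.lower() for state in node_states)
    unhealthy = sum(counts[state] for state in UNHEALTHY_STATES)
    return unhealthy / len(node_states) > max_unhealthy_fraction


def schedule_patch_body(current_job_spec, new_pool_id):
    """Copy the schedule's job specification and point it at another pool."""
    spec = dict(current_job_spec)  # shallow copy, the input is left untouched
    spec["poolInfo"] = {"poolId": new_pool_id}
    return {"jobSpecification": spec}


# Hypothetical data: three of the four nodes are in an unhealthy state.
states = ["idle", "unusable", "unusable", "starttaskfailed"]
body = None
if pool_looks_stuck(states):
    current_spec = {"poolInfo": {"poolId": "pool-a"}, "priority": 0}
    body = schedule_patch_body(current_spec, "pool-b")
    print(body)
```

The resulting `body` would then be sent with whatever client you use to update the job schedule. Jobs that are already active keep running on the old pool, so those would need to be handled separately.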

We also allow you to set a maximum retry count on a job. If a job's tasks fail for any reason, you can specify how many times they should be retried.
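As a rough illustration of the retry setting, here is a sketch that builds the constraints fragment of a job specification. It assumes the retry limit is expressed as `maxTaskRetryCount` (with `-1` meaning no limit) alongside an ISO 8601 `maxWallClockTime`; the helper names and default values are hypothetical.

```python
def job_constraints(max_task_retry_count=3, max_wall_clock_time="PT1H"):
    """Build a constraints fragment for a job specification (assumed shape)."""
    if max_task_retry_count < -1:
        raise ValueError("retry count must be -1 (no limit) or a non-negative integer")
    return {
        "constraints": {
            "maxTaskRetryCount": max_task_retry_count,
            "maxWallClockTime": max_wall_clock_time,  # ISO 8601 duration
        }
    }


def attempts_allowed(max_task_retry_count):
    """One initial attempt plus the retries; None stands for 'no limit'."""
    if max_task_retry_count == -1:
        return None
    return max_task_retry_count + 1


fragment = job_constraints(max_task_retry_count=2)
print(fragment, attempts_allowed(2))
```

A retry count of 2 therefore means a task is attempted up to three times in total before it is reported as failed.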

