How to design a distributed job scheduler?
Question
I want to design a job scheduler cluster made up of several hosts that perform cron-style job scheduling. For example, when a job that needs to run every 5 minutes
is submitted to the cluster, the cluster should decide which host fires the next run, while ensuring:
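- Disaster tolerance: as long as not all hosts are down, the job should still be fired successfully.
- Validity: only one host fires the next run of the job.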
For disaster tolerance, a job cannot be bound to a specific host. One approach is to have all hosts poll a DB table (with a lock, of course), which guarantees that only one host gets the next job run. Since this locks the table frequently, is there a better design?
Recommended answer
Searching turned up Dkron, a distributed job scheduling system. It has a REST API and looks good. I plan to try it; see the Dkron site.
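As a rough illustration of what submitting the 5-minute job through a REST API could look like, the sketch below builds (but does not send) an HTTP request. The endpoint path `/v1/jobs`, the JSON field names, the `shell` executor, the `@every 5m` schedule syntax, and the `localhost:8080` address are assumptions based on my recollection of Dkron's documentation and should be checked against the current docs; `build_dkron_job_request` is a hypothetical helper, not part of Dkron.

```python
import json
import urllib.request


def build_dkron_job_request(base_url, name, schedule, command):
    """Build a POST request that would register a job (assumed payload shape)."""
    payload = {
        "name": name,
        "schedule": schedule,                    # assumed syntax, e.g. "@every 5m"
        "executor": "shell",                     # assumed executor name
        "executor_config": {"command": command},
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/jobs",               # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_dkron_job_request(
        "http://localhost:8080", "five-minute-job", "@every 5m", "date"
    )
    print(req.get_method(), req.full_url)
    # To actually submit it to a running server:
    # urllib.request.urlopen(req)
```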