Quartz Performance

Problem description

There seems to be a limit on the number of jobs the Quartz scheduler can run per second. In our scenario, about 20 jobs per second fire around the clock (24x7). Quartz worked well up to 10 jobs per second (with 100 Quartz threads and a database connection pool of 100 for a JDBC-backed JobStore). When we increased the load to 20 jobs per second, however, Quartz became very slow: jobs are triggered long after their scheduled time, which causes many misfires and eventually degrades the overall performance of the system significantly. One interesting fact is that, for such delayed triggers, JobExecutionContext.getScheduledFireTime().getTime() comes out 10-20 minutes or more after their scheduled time.

How many jobs per second can the Quartz scheduler run without affecting their scheduled times, and what is the optimum number of Quartz threads for such a load?

Or am I missing something here?

We have almost 10k items (divided among 2 or more categories; currently 2) that need processing at a given frequency, e.g. every 15, 30, 60... minutes, and within that period they must be processed at a given per-minute throttle. For example, with a 60-minute frequency, the 5k items in each category should be processed at a throttle of 500 items per minute. Ideally, then, these items are processed within the first 10 (5000/500) minutes of each hour, with the 500 items of each minute spread evenly across its seconds, which gives around 8-9 items per second for one category.
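The arithmetic behind those figures can be written out as a small sketch. The class and method names below are hypothetical and only restate the numbers from the question; with two categories the combined rate comes to roughly 17 items per second, in line with the "about 20 jobs per second" mentioned above.

```java
// Worked arithmetic for the throttle described in the question.
// ThrottleMath and its methods are hypothetical names used for illustration only.
final class ThrottleMath {
    /** Minutes needed to drain `items` at `perMinute` items per minute (rounded up). */
    static long minutesToDrain(long items, long perMinute) {
        return (items + perMinute - 1) / perMinute;
    }

    /** Average items per second for a given per-minute throttle. */
    static double itemsPerSecond(long perMinute) {
        return perMinute / 60.0;
    }

    /** Gap in milliseconds between consecutive items spread evenly over one minute. */
    static long spacingMillis(long perMinute) {
        return 60_000L / perMinute;
    }

    public static void main(String[] args) {
        System.out.println(minutesToDrain(5000, 500)); // 10
        System.out.println(itemsPerSecond(500));       // roughly 8.33 per category
        System.out.println(spacingMillis(500));        // 120
    }
}
```

A spacing of 120 ms per item is already below the one-second granularity, which is where the scheduling pressure in this setup comes from.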

To achieve this, we use Quartz as the scheduler that triggers the jobs for processing these items. However, we don't process each item within the Job.execute method, because processing one item takes 5-50 seconds (30 seconds on average) and involves a web service call. Instead, we push one message per item onto a JMS queue, and separate server machines process those messages. I have observed that the Job.execute method takes no more than 30 milliseconds.

The scheduler runs on Solaris SPARC 64-bit servers, each with an 8-core/16-thread CPU and 16GB RAM; we have two such machines in the scheduler cluster.

Recommended answer

In a previous project, I was confronted with the same problem. In our case, Quartz performed well down to a granularity of one second. Sub-second scheduling was a stretch and, as you are observing, misfires happened often and the system became unreliable.

We solved this issue by creating 2 levels of scheduling: Quartz would schedule a job 'set' of n consecutive jobs. With a clustered Quartz, this means that a given server in the system would get this job 'set' to execute. The n tasks in the set are then taken in by a "micro-scheduler": basically a timing facility that used the native JDK API to further time the jobs down to a 10ms granularity.

To handle the individual jobs, we used a master-worker design, where the master took care of the scheduled delivery (throttling) of the jobs to a multi-threaded pool of workers.
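As a rough illustration of such a master-worker arrangement (not the original project's code), the sketch below uses a single-threaded scheduled executor as the master, which releases at most one pending job per interval to a fixed worker pool. The class name ThrottledMaster, its methods, and the chosen intervals are assumptions made for this example.

```java
import java.util.Queue;
import java.util.concurrent.*;

// Hypothetical sketch: a "master" thread throttles delivery of jobs to a worker pool.
final class ThrottledMaster {
    private final ScheduledExecutorService master = Executors.newSingleThreadScheduledExecutor();
    private final ExecutorService workers;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    ThrottledMaster(int workerThreads) {
        this.workers = Executors.newFixedThreadPool(workerThreads);
    }

    /** Queue a job; it runs only once the master releases it. */
    void submit(Runnable job) {
        pending.add(job);
    }

    /** Release at most one pending job to the workers every intervalMillis. */
    void start(long intervalMillis) {
        master.scheduleAtFixedRate(() -> {
            Runnable job = pending.poll();
            if (job != null) {
                workers.execute(job);
            }
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Stop releasing jobs (anything still pending is dropped) and wait for running ones. */
    void shutdown() throws InterruptedException {
        master.shutdown();
        workers.shutdown();
        workers.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        ThrottledMaster m = new ThrottledMaster(4);
        for (int i = 0; i < 3; i++) {
            final int id = i;
            m.submit(() -> System.out.println("processing item " + id));
        }
        m.start(100);      // one job released every 100 ms
        Thread.sleep(500); // give the master time to release all three
        m.shutdown();
    }
}
```

Because the master only hands jobs over, a slow job occupies a worker thread but does not delay the release of the next one, as long as the pool has a free thread.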

If I had to do this again today, I'd rely on a ScheduledThreadPoolExecutor to manage the 'micro-scheduling'. For your case, it would look something like this:

import java.util.Set;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

ScheduledThreadPoolExecutor scheduledExecutor;
...
    scheduledExecutor = new ScheduledThreadPoolExecutor(THREAD_POOL_SIZE);
...

// Evenly spread the execution of a set of tasks over a period of time.
// Task is assumed to implement Runnable.
public void schedule(Set<Task> taskSet, long timePeriod, TimeUnit timeUnit) {
    if (taskSet.isEmpty()) return; // or indicate some failure ...
    long period = TimeUnit.MILLISECONDS.convert(timePeriod, timeUnit);
    long delay = period / taskSet.size(); // gap between consecutive tasks, in ms
    long accumulativeDelay = 0;
    for (Task task : taskSet) {
        scheduledExecutor.schedule(task, accumulativeDelay, TimeUnit.MILLISECONDS);
        accumulativeDelay += delay;
    }
}

This gives you a general idea of how to use the JDK facility to micro-schedule tasks. (Disclaimer: you need to make this robust for a production environment, e.g. check for failing tasks, manage retries (if supported), etc.)
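One possible way to approach the failure handling mentioned in the disclaimer is to wrap each task so that an exception triggers a delayed retry. This matters because a task that throws inside a ScheduledThreadPoolExecutor only records the exception in its Future, so an unchecked failure is easy to miss. The wrapper below is a minimal sketch; RetryingTask, maxRetries and retryDelayMillis are hypothetical names, and logging to System.err stands in for whatever error reporting a real system would use.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: retry a failing task a bounded number of times.
final class RetryingTask implements Runnable {
    private final Runnable delegate;
    private final ScheduledExecutorService executor;
    private final int maxRetries;
    private final long retryDelayMillis;
    private final AtomicInteger attempts = new AtomicInteger();

    RetryingTask(Runnable delegate, ScheduledExecutorService executor,
                 int maxRetries, long retryDelayMillis) {
        this.delegate = delegate;
        this.executor = executor;
        this.maxRetries = maxRetries;
        this.retryDelayMillis = retryDelayMillis;
    }

    @Override
    public void run() {
        int attempt = attempts.incrementAndGet();
        try {
            delegate.run();
        } catch (RuntimeException e) {
            if (attempt <= maxRetries) {
                // Re-submit this wrapper after a delay instead of failing silently.
                executor.schedule(this, retryDelayMillis, TimeUnit.MILLISECONDS);
            } else {
                System.err.println("Task failed after " + attempt + " attempts: " + e);
            }
        }
    }

    /** Number of times the delegate has been invoked so far. */
    int attempts() {
        return attempts.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService executor = Executors.newScheduledThreadPool(2);
        AtomicInteger calls = new AtomicInteger();
        Runnable flaky = () -> {
            if (calls.incrementAndGet() < 3) throw new IllegalStateException("transient failure");
            System.out.println("succeeded on call " + calls.get());
        };
        executor.schedule(new RetryingTask(flaky, executor, 3, 50), 0, TimeUnit.MILLISECONDS);
        Thread.sleep(500);
        executor.shutdown();
    }
}
```

A wrapped task can be passed to the schedule method shown earlier in place of the bare task, since it is itself a Runnable.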

With some testing and tuning, we found an optimal balance between the number of Quartz jobs and the number of jobs in one scheduled set.

We experienced a 100x throughput improvement this way. Network bandwidth was our actual limit.
