How to decide on the ThreadPoolTaskExecutor pools and queue sizes?

Question

This may be a more general question about how to decide on a thread pool size, but let's use the Spring ThreadPoolTaskExecutor for this case. I have the following configuration for the pool core size, max size and queue capacity. I've already read about what all these settings mean - there is a good answer here.

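    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.annotation.Bean;
    import org.springframework.core.task.TaskExecutor;
    import org.springframework.scheduling.annotation.EnableAsync;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

    @SpringBootApplication
    @EnableAsync
    public class MySpringBootApp {

        public static void main(String[] args) {
            ApplicationContext ctx = SpringApplication.run(MySpringBootApp.class, args);
        }

        @Bean
        public TaskExecutor taskExecutor() {
            ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
            executor.setCorePoolSize(5);    // threads kept in the pool
            executor.setMaxPoolSize(10);    // upper bound, reached only once the queue is full
            executor.setQueueCapacity(25);  // tasks that may wait before new ones are rejected
            return executor;
        }

    }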

The above numbers look random to me and I want to understand how to set them up correctly based on my environment. I will outline the following constraints that I have:

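  1. The application will run on a two-core CPU box.
  2. The executor will work on tasks that generally take 1-2 seconds to complete.
  3. Typically I expect about 800 tasks/min to be submitted to my executor, spiking at 2500 tasks/min.
  4. The task will construct some objects and make an HTTP call to Google Pub/Sub.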

Ideally I'd like to understand what other constraints I need to consider and based on them what will be a reasonable configuration for my pools and queue sizes.

Recommended answer

Update: This answer got a few votes over the years, so I'm adding a shortened version for people who don't have the time to read my weird metaphor:

TL;DR answer:

The actual constraint is that a (logical) CPU core can only run a single thread at any given time. Thus:

  • Number of core threads: number of logical cores of your CPU * 1/(ratio_of_time_your_thread_is_runnable_when_doing_your_task)

So, if you have 8 logical cores on your machine, you can safely put 8 threads in your thread pool (well, remember to account for the other threads that may be in use). Then you need to ask yourself whether you can add more: benchmark the kind of task you intend to run on your thread pool. If you notice that the threads are, on average, running only 50% of the time, that means your CPU is free to work on another thread 50% of the time, and you can add more threads.
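
The rule of thumb above can be sketched in a few lines of Java. This is only an illustration: the class name PoolSizing, the method threadsFor and the choice to round down are my own assumptions, and the runnable ratio has to come from benchmarking your own tasks.

```java
final class PoolSizing {

    /**
     * Threads = logical cores / fraction of time a task's thread is runnable.
     * runnableRatio (hypothetical parameter) must be in (0, 1] and should be
     * measured with your own benchmark. The result is rounded down.
     */
    static int threadsFor(int logicalCores, double runnableRatio) {
        if (logicalCores < 1 || runnableRatio <= 0.0 || runnableRatio > 1.0) {
            throw new IllegalArgumentException("need cores >= 1 and 0 < ratio <= 1");
        }
        return (int) Math.floor(logicalCores / runnableRatio);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // Pure CPU work: one thread per logical core.
        System.out.println("ratio 1.0 -> " + threadsFor(cores, 1.0) + " threads");
        // Threads runnable only half of the time: twice as many threads fit.
        System.out.println("ratio 0.5 -> " + threadsFor(cores, 0.5) + " threads");
    }
}
```

With 8 logical cores and threads that are runnable 50% of the time, threadsFor(8, 0.5) gives 16.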

  • Queue size: as many as you can wait on.

The queue size is the number of items your thread pool will accept before rejecting new ones. It is business logic. It depends on what behavior you expect: is there any point in accepting a billion tasks? When do you throw in the towel? If one task takes one second to complete and you have 10 threads, the 10,000th task in the queue will hopefully be done in 1000 seconds. Is that acceptable? The worst thing that can happen is clients timing out and re-submitting the same tasks before you have completed the first ones.
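
The arithmetic in that example can be written down as a tiny helper. It is a rough estimate that assumes every task takes about the same time and all threads stay busy; QueueWait and estimatedSeconds are hypothetical names used only for illustration.

```java
final class QueueWait {

    /** Rough number of seconds until the task at the given queue position is done. */
    static double estimatedSeconds(long queuePosition, double secondsPerTask, int threads) {
        return queuePosition * secondsPerTask / threads;
    }

    public static void main(String[] args) {
        // The example from the text: 10,000th task, 1 second per task, 10 threads.
        System.out.println(estimatedSeconds(10_000, 1.0, 10) + " seconds");
    }
}
```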

Original ELI12 answer:

It may not be the most accurate answer, but I'll try:

A simple approach is to be aware that your 2-core CPU can only work on two threads at the same time.

If you have a relatively modern Intel CPU with Hyper-Threading (aka HT(TM), HTT(TM), SMT) turned on (via a setting in the BIOS), your operating system will see twice as many available cores as there are physical cores in your CPU.

Either way, to detect from Java how many cores (or threads that can run simultaneously without preempting each other) you can work with, just call int cores = Runtime.getRuntime().availableProcessors();

If you try to see your application as a workshop (an actual one):

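  • A processor is represented by an employee. It is the physical unit that adds value to your products.
  • A task is a pile of raw material (plus a list of instructions).
  • A thread is a desk on which the employee can put a task and work on it.
  • The queue size is the length of the conveyor belt that brings the raw material to the desks.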
Thus, your question becomes "How many desks should I have, and how long can the conveyor belt inside my factory be, given a fixed number of employees?".

For the how-many-desks (threads) part:

An employee can only work at one desk at a time, and you can only have a single employee per desk. Thus, the basic setup is to have at least as many desks as you have employees, so that no employee (processor) is left without any possibility to work.

But, depending on your activity, you may be able to afford more desks per employee:

If your employees are expected to put mail into envelopes constantly, an operation that requires their full attention (in programming: sorting collections, creating objects, incrementing counters), having more desks won't help. It may even be detrimental, because your employee will sometimes have to change desks (a context switch, which takes some time), leaving the task they were working on in order to make progress on another.

But if your task is making pottery and involves your employee waiting for the clay to bake in an oven (read: waiting on an external resource such as a file system, a web service, etc.), your employee can afford to go model clay at another desk and come back to the first one later.

Thus, you can afford more desks per employee as long as your tasks spend a large enough share of their time waiting rather than actively working (a high waiting/running ratio). The number of desks is then how many tasks your employee can make progress on during the waiting time.

For the conveyor belt (queue) size part:

The queue size represents how many items you allow to be queued before you start rejecting further tasks (by throwing an exception). It is the threshold at which you start saying "OK, I'm already overbooked and won't ever be able to comply".
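
To see that threshold in action, here is a minimal sketch using the plain JDK java.util.concurrent.ThreadPoolExecutor, which is what Spring's ThreadPoolTaskExecutor delegates to. The class RejectionDemo, the method countRejected and the use of an ArrayBlockingQueue are assumptions made for this illustration, not the asker's setup. Tasks block on a latch to stand in for slow work, and the default rejection policy (AbortPolicy) throws RejectedExecutionException once the threads and the queue are both full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class RejectionDemo {

    /** Submits `submissions` long-running tasks and returns how many were rejected. */
    static int countRejected(int threads, int queueCapacity, int submissions) {
        CountDownLatch release = new CountDownLatch(1);
        // Fixed-size pool with a bounded queue; the default handler is AbortPolicy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueCapacity));
        int rejected = 0;
        try {
            for (int i = 0; i < submissions; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // stand-in for a task that is still busy
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // threads and queue are full: "we're overbooked"
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 1 thread + 2 queue slots can hold 3 tasks, so the 4th is rejected.
        System.out.println(countRejected(1, 2, 4) + " task(s) rejected");
    }
}
```

In a real service you would catch the exception at the submission point and tell the caller to back off, rather than just counting.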

First, I'd say your conveyor belt needs to fit inside the workshop, meaning that the collection should be small enough to prevent out-of-memory errors (obviously).

After that, it depends on your company policy. Let's assume a task is added to the belt every time a client places an order (another service calls your API). If the caller doesn't care how much time you take to comply and trusts you enough with the execution, there's no point in limiting the size of the belt.

But suppose your client gets annoyed after waiting a month for their pottery and either leaves you for a competitor or orders the same pottery again, assuming the first order was lost and never bothering to check whether it was completed. That first order was done for nothing and you won't get paid. And if your client places another order whenever you're too slow to comply, you'll enter a feedback loop, because every new order slows down the whole process.

Thus, in that case, you should put up a sign telling your clients "Sorry, we're overbooked. You shouldn't place any new order now, as we won't be able to comply within an acceptable time range".

Then, the queue size would be: acceptable time range / time to complete a task.

Concrete example: if your client service expects the tasks it submits to be completed in less than 100 seconds, and every task takes 1-2 seconds, you should limit the queue to 50-100 tasks. Once you have 100 tasks waiting in the queue, you can be pretty sure that the next one won't be completed in less than 100 seconds, so you reject it to keep the service from waiting for nothing.
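
As a sketch, the same calculation in Java. The names QueueCapacity and forDeadline are hypothetical, and multiplying by the number of threads is my own extension to cover pools with more than one worker (it matches the TL;DR example of 10 threads). With a single worker it reduces to the formula above.

```java
final class QueueCapacity {

    /**
     * Largest queue size for which the last queued task should still finish
     * within the acceptable time, assuming tasks take about secondsPerTask each.
     */
    static int forDeadline(double acceptableSeconds, double secondsPerTask, int threads) {
        if (acceptableSeconds <= 0 || secondsPerTask <= 0 || threads < 1) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        return (int) Math.floor(acceptableSeconds / secondsPerTask * threads);
    }

    public static void main(String[] args) {
        // The example from the text: 100 s deadline, tasks of 1 to 2 seconds, one worker.
        System.out.println("slow tasks (2 s): " + forDeadline(100, 2, 1));
        System.out.println("fast tasks (1 s): " + forDeadline(100, 1, 1));
    }
}
```

Sizing for the slower task duration gives the more conservative limit (50 here), which is the safer choice if missing the deadline is costly.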
