How many threads should I use in my Java program?

Question

I recently inherited a small Java program that takes information from a large database, does some processing and produces a detailed image regarding the information. The original author wrote the code using a single thread, then later modified it to allow it to use multiple threads.

In the code he defines a constant:

// number of threads
public static final int THREADS = Runtime.getRuntime().availableProcessors();

This constant then sets the number of threads that are used to create the image.

I understand his reasoning: the number of threads cannot be greater than the number of available processors, so set it to that amount to get the full potential out of the processor(s). Is this correct? Or is there a better way to utilize the full potential of the processor(s)?

Edit: To give some more clarification, the specific algorithm that is being threaded scales to the resolution of the picture being created (1 thread per pixel). That is obviously not the best solution, though. The work that this algorithm does is what takes all the time, and it consists wholly of mathematical operations; there are no locks or other factors that will cause any given thread to sleep. I just want to maximize the program's CPU utilization to decrease the time to completion.

Recommended answer

Threads are fine, but as others have noted, you have to be highly aware of your bottlenecks. Your algorithm sounds like it would be susceptible to cache contention between multiple CPUs - this is particularly nasty because it has the potential to hit the performance of all of your threads (normally you think of using multiple threads to continue processing while waiting for slow or high latency IO operations).

Cache contention is a very important aspect of using multiple CPUs to process a highly parallelized algorithm: make sure that you take your memory utilization into account. If you can construct your data objects so each thread has its own memory that it is working on, you can greatly reduce cache contention between the CPUs. For example, it may be easier to have one big array of ints and have different threads work on different parts of that array, but in Java the bounds checks on that array are going to be trying to access the same address in memory, which can cause a given CPU to have to reload data from L2 or L3 cache.
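As a rough illustration of giving each thread its own memory, the sketch below splits an image into horizontal bands, lets each task fill a private array, and copies the bands into the final image only after the work is done. The class name `BandedRender`, the `shade` function, and the band layout are all assumptions made for this example; they stand in for the real per-pixel math, which the question does not show.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BandedRender {
    // Hypothetical stand-in for the real per-pixel computation.
    static int shade(int x, int y) {
        return (x * 31 + y * 17) & 0xFF;
    }

    // Renders a width x height image using one task per band of rows.
    static int[] render(int width, int height, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int rowsPerBand = (height + threads - 1) / threads; // ceiling division
            List<Future<int[]>> bands = new ArrayList<>();
            for (int start = 0; start < height; start += rowsPerBand) {
                final int first = start;
                final int last = Math.min(height, start + rowsPerBand);
                Callable<int[]> task = () -> {
                    // Private buffer: no other task writes to this array.
                    int[] local = new int[(last - first) * width];
                    for (int y = first; y < last; y++) {
                        for (int x = 0; x < width; x++) {
                            local[(y - first) * width + x] = shade(x, y);
                        }
                    }
                    return local;
                };
                bands.add(pool.submit(task));
            }
            // Merge the bands in row order on the calling thread.
            int[] image = new int[width * height];
            int offset = 0;
            for (Future<int[]> band : bands) {
                int[] local = band.get();
                System.arraycopy(local, 0, image, offset, local.length);
                offset += local.length;
            }
            return image;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int threads = Runtime.getRuntime().availableProcessors();
        int[] image = render(640, 480, threads);
        System.out.println("rendered " + image.length + " pixels on " + threads + " threads");
    }
}
```

One task per band, rather than one thread per pixel, keeps the thread count tied to the hardware while the picture resolution only changes the size of each band. Whether the extra copy at the end pays for itself is something to measure rather than assume.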

Split the data into separate data structures, and configure those data structures so they are thread local. It might even be better to use ThreadLocal, which gives each thread its own independent copy of a value.
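A minimal sketch of the thread-local idea, assuming each unit of work needs a scratch buffer: `ThreadLocal.withInitial` hands every worker thread its own array, so tasks running on different threads never share it. The class `ScratchBuffers`, the buffer size, and the arithmetic in `processRow` are hypothetical and exist only to make the example runnable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ScratchBuffers {
    static final int SCRATCH_SIZE = 256; // hypothetical working-set size

    // Each thread that calls SCRATCH.get() receives its own double[].
    private static final ThreadLocal<double[]> SCRATCH =
            ThreadLocal.withInitial(() -> new double[SCRATCH_SIZE]);

    // Hypothetical row computation: fills the thread's scratch buffer and sums it.
    static double processRow(int row) {
        double[] scratch = SCRATCH.get();
        double sum = 0.0;
        for (int i = 0; i < scratch.length; i++) {
            scratch[i] = row * 0.5 + i;
            sum += scratch[i];
        }
        return sum;
    }

    // Runs processRow for every row on a fixed pool and collects the results in order.
    static double[] processAll(int rows, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int row = 0; row < rows; row++) {
                final int r = row;
                Callable<Double> task = () -> processRow(r);
                futures.add(pool.submit(task));
            }
            double[] results = new double[rows];
            for (int row = 0; row < rows; row++) {
                results[row] = futures.get(row).get();
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        double[] results = processAll(8, Runtime.getRuntime().availableProcessors());
        System.out.println("first row sum = " + results[0]);
    }
}
```

Because the pool reuses its threads, the buffer is allocated once per worker rather than once per task.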

The best piece of advice I can give you is test, test, test. Don't make assumptions about how CPUs will perform - there is a huge amount of magic going on in CPUs these days, often with counterintuitive results. Note also that the JIT runtime optimization will add an additional layer of complexity here (maybe good, maybe not).
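One way to act on that advice is to time the same workload at several thread counts and compare. The harness below is only a sketch: `ThreadSweep`, the `work` function, and the task count are made up for illustration, and a serious measurement would use a proper benchmarking tool and the real rendering code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ThreadSweep {
    // Written after timing so the results are observably used.
    static volatile long sink;

    // Hypothetical CPU-bound task standing in for the real image math.
    static long work(int seed) {
        long acc = seed;
        for (int i = 0; i < 200_000; i++) {
            acc = acc * 6364136223846793005L + 1442695040888963407L;
        }
        return acc;
    }

    // Runs `tasks` copies of work() on a pool of `threads` threads; returns elapsed nanoseconds.
    static long timeNanos(int threads, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> jobs = new ArrayList<>();
            for (int t = 0; t < tasks; t++) {
                final int seed = t;
                jobs.add(() -> work(seed));
            }
            long start = System.nanoTime();
            long checksum = 0;
            for (Future<Long> f : pool.invokeAll(jobs)) {
                checksum += f.get();
            }
            long elapsed = System.nanoTime() - start;
            sink = checksum;
            return elapsed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int cores = Runtime.getRuntime().availableProcessors();
        timeNanos(cores, 256); // warm-up run so the JIT has compiled work()
        for (int threads = 1; threads <= 2 * cores; threads++) {
            long nanos = timeNanos(threads, 256);
            System.out.println(threads + " threads: " + (nanos / 1_000_000) + " ms");
        }
    }
}
```

The sweep runs past the core count on purpose, so the numbers show whether extra threads help, hurt, or make no difference on a given machine.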
