Why does restricting a multithreaded application to one core make it run faster?


Question

I have a native multithreaded Win32 application written in C++ with about 3 relatively busy threads and 4 to 6 threads that do little work. Run normally on an 8-core machine, total CPU usage is about 15% and the application finishes in about 30 seconds. When I restrict the application to a single core by setting the affinity mask to 0x01, it completes faster, in 23 seconds.

My guess is that synchronization is cheaper when everything runs on one physical core, and/or that concurrent memory access causes problems across cores.

I'm running Windows 7 x64; the application is 32-bit. The CPU is a Xeon X5570 with 4 cores and HT enabled.

Could anyone explain this behavior in detail? Why does it happen, and how can it be predicted ahead of time?

Update: I guess my question wasn't very clear. I would like to know why it gets faster on one physical core, not why it doesn't get above 15% on multiple cores.
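For readers who want to reproduce the experiment, the restriction described in the question can be sketched roughly as follows. This is only an illustration: the question does not say how the mask was applied (it could equally have been set from Task Manager), and `restrict_process_to` and `kFirstCoreMask` are hypothetical names. The Windows branch uses `SetProcessAffinityMask`; the non-Windows branch is an assumed fallback using `sched_setaffinity` so the sketch builds elsewhere.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
#include <sched.h>
#endif

// Mask from the question: only bit 0 set, i.e. logical processor 0.
constexpr std::uint64_t kFirstCoreMask = 0x01;

// Hypothetical helper: restrict execution to the logical processors
// whose bits are set in `mask`. Returns true on success.
bool restrict_process_to(std::uint64_t mask) {
    assert(mask != 0 && "an empty affinity mask is invalid");
#ifdef _WIN32
    // Applies to every thread of the current process. In a 32-bit
    // build DWORD_PTR is 32 bits wide, so higher bits are dropped.
    return SetProcessAffinityMask(GetCurrentProcess(),
                                  static_cast<DWORD_PTR>(mask)) != 0;
#else
    // Assumed fallback (Linux): pid 0 means the calling thread;
    // threads created afterwards inherit the same affinity.
    cpu_set_t set;
    CPU_ZERO(&set);
    for (int cpu = 0; cpu < 64; ++cpu) {
        if (mask & (std::uint64_t{1} << cpu)) {
            CPU_SET(cpu, &set);
        }
    }
    return sched_setaffinity(0, sizeof(set), &set) == 0;
#endif
}
```

Calling `restrict_process_to(kFirstCoreMask)` at the top of `main`, before any worker threads start, would approximate the 0x01 setup; timing the run with and without the call gives the comparison the question describes.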

Recommended answer

Without knowing more about the application, it is hard to do more than guess at what is slowing it down. For a detailed analysis, consider the following factors:

  • Interprocessor communication: How much do the threads in your application communicate with each other? If they communicate very often, that communication adds overhead when the threads run on different cores.

  • Processor cache architecture: This is another important factor. You should know how the processor's caches are affected when threads run on different cores, and how much thrashing occurs in the shared caches.

  • Page faults: Perhaps running on a single processor causes fewer page faults because of the sequential nature of your program?

  • Locks: Is there lock overhead in your code? On its own this should not cause a slowdown, but combined with the factors above it may add some overhead.

  • NoC (network-on-chip) on the processor: If you assign different threads to different processor cores and they communicate, you need to know what path that communication takes. Is there a dedicated connection between the cores? Perhaps you should have a look at this link.

  • Processor load: Last but not least, I hope there are no other tasks running on the other cores and causing a lot of context switches. Context switches are typically very expensive.

  • Temperature: One effect to consider is the processor clock being slowed down when a CPU core heats up. I don't think you are seeing this effect, but it also depends heavily on the ambient temperature.
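One way to get a feel for the communication and cache factors above is a small experiment in which two threads update counters that either share a cache line or sit on separate lines. The sketch below is only an illustration, not a diagnosis of the application in the question: the type and function names are hypothetical, and the 64-byte cache-line size is an assumption that holds for many x86 processors but should be checked for the target machine.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Two counters adjacent in memory, so they likely share a cache line.
struct SharedLine {
    std::atomic<std::uint64_t> a{0};
    std::atomic<std::uint64_t> b{0};
};

// The same counters, each aligned to 64 bytes (assumed line size),
// so each one occupies its own cache line.
struct SeparateLines {
    alignas(64) std::atomic<std::uint64_t> a{0};
    alignas(64) std::atomic<std::uint64_t> b{0};
};

// Runs two threads, each incrementing its own counter `iterations`
// times, and returns the elapsed wall-clock time in seconds.
template <typename Counters>
double run_two_writers(Counters& c, std::uint64_t iterations) {
    c.a.store(0);
    c.b.store(0);
    auto bump = [iterations](std::atomic<std::uint64_t>& x) {
        for (std::uint64_t i = 0; i < iterations; ++i) {
            x.fetch_add(1, std::memory_order_relaxed);
        }
    };
    const auto start = std::chrono::steady_clock::now();
    std::thread t1([&] { bump(c.a); });
    std::thread t2([&] { bump(c.b); });
    t1.join();
    t2.join();
    const auto stop = std::chrono::steady_clock::now();
    // Each thread touches only its own counter, so both must match.
    assert(c.a.load() == iterations && c.b.load() == iterations);
    return std::chrono::duration<double>(stop - start).count();
}
```

Comparing `run_two_writers` on a `SharedLine` and on a `SeparateLines` object with the same iteration count shows how much time goes to cache-line traffic between cores. Running the same comparison with the process restricted to one core is a rough way to see whether that traffic disappears; the actual numbers depend on the hardware and the scheduler.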
