Matlab: -maxNumCompThreads, hyperthreading, and parpool
Question
I'm running Matlab R2014a on a node in a Linux cluster that has 20 cores and hyperthreading enabled. I know this has been discussed before, but I'm looking for some clarification. Here is my understanding of the threads vs. cores issue in Matlab:
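- Matlab has built-in multithreading and will take advantage of extra cores on a multicore machine.
- Matlab runs its threads in such a way that putting multiple Matlab threads on the same core (i.e. hyperthreading) isn't useful. So by default, the maximum number of threads that Matlab will create is the number of cores on your system.
- When using parpool(), regardless of the number of workers you create, each worker will use only one physical core.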
However, I've also read that using the (deprecated) function maxNumCompThreads(), you can either decrease or increase the number of threads that Matlab or one of the workers will generate. This can be useful in several scenarios:
- You want to utilize Matlab's implicit multithreading capabilities to run some code on a cluster node without allocating the entire node. It would be nice if there were some other way to do this in case maxNumCompThreads ever gets removed.
- You want to do a parameter sweep but have fewer parameters than the number of cores on your machine. In this case you might want to increase the number of threads per worker so that all of your cores are utilized. This was suggested recently in this thread.

However, in my experience, while the individual workers seem quite happy to use maxNumCompThreads() to increase their thread count, inspecting the actual CPU usage with the "top" command suggests that it has no effect, i.e. each worker still only gets to use one core. It's possible that the individual Matlab processes spawned by the parpool are run with the argument -singleCompThread. I've confirmed that if the parent Matlab process is run with -singleCompThread, the command maxNumCompThreads(n), where n > 1, throws an error because Matlab is running in single-threaded mode. So the result seems to be that (at least in R2014a) you can't increase the number of computational threads on the parallel pool workers.

Related to this, I can't seem to get the parent Matlab process to start more threads than there are cores, even though the computer itself has hyperthreading enabled. Again, it will happily run maxNumCompThreads(n), where n > # physical cores, but the fact that top shows CPU utilization at 50% suggests the extra threads are never started. So what is happening, or what am I misunderstanding?
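For reference, the query/set/restore pattern behind the first scenario above can be sketched as follows. This is only a minimal illustration, not a recommended setup: the value of `nRequested` and the matrix size are assumptions chosen for the example, and the source notes that the call throws an error when Matlab was started with -singleCompThread.

```matlab
% Minimal sketch: cap the number of computational threads for one
% block of work, then restore the previous limit.
% nRequested is a hypothetical value, not taken from the original post.
nRequested = 4;

nBefore = maxNumCompThreads;            % query the current limit
prev = maxNumCompThreads(nRequested);   % set a new limit; returns the old one

% Some implicitly multithreaded work (matrix size chosen arbitrarily).
A = randn(2000);
B = A * A;

maxNumCompThreads(prev);                % restore the previous limit
nAfter = maxNumCompThreads;
```

Whether the requested number of threads is actually used still has to be checked externally, for example with top, as described above.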
To lay out my questions more explicitly:
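1. Why doesn't setting maxNumCompThreads(n), with n > 1, seem to work in a parfor loop? If it's because the worker processes are started with -singleCompThread, why doesn't maxNumCompThreads() return an error like it does in a parent process started with -singleCompThread?
2. In the parent process, why doesn't maxNumCompThreads(n), where n > # physical cores, do anything?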
Note: I posted this previously on Matlab Answers, but haven't received any feedback.
Edit 2: It looks like the problem in (1) was an issue with the test code I was using.
Recommended answer
I was wrong about maxNumCompThreads not working on parpool workers. It looks like the problem was that the code I was using:

    parfor j = 1:2
        tic
        maxNumCompThreads(2);
        workersCompThreads(j) = maxNumCompThreads;
        i = 1;
        while toc < 200
            a = randn(10^i)*randn(10^i);
            i = i + 1;   % matrix dimension grows tenfold on every pass
        end
    end
used so much memory by the time I checked CPU utilization that the bottleneck was I/O and the extra threads had already been shut down. When I did the following:

    parfor j = 1:2
        tic
        maxNumCompThreads(2);
        workersCompThreads(j) = maxNumCompThreads;
        i = 4;   % fixed matrix dimension of 10^4, so memory use stays constant
        while toc < 200
            a = randn(10^i)*randn(10^i);
        end
    end
the extra threads started and stayed running.
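A slightly more defensive version of that test might record the thread limit on each worker before and after the change, so the reported values can be compared with what top shows. This is a sketch under assumptions: `nWorkers`, `threadsPerWorker`, `runSeconds`, and the matrix size `n` are hypothetical values, and it assumes the Parallel Computing Toolbox is available.

```matlab
% Sketch: raise the thread limit on each parpool worker and record what
% each worker reports. All parameter values below are assumptions.
nWorkers         = 2;
threadsPerWorker = 2;
runSeconds       = 10;    % long enough to look at top in another shell
n                = 2000;  % fixed matrix size keeps memory use constant

% Start a pool only if none is running.
if isempty(gcp('nocreate'))
    parpool(nWorkers);
end

before = zeros(1, nWorkers);
after  = zeros(1, nWorkers);

parfor j = 1:nWorkers
    % Setting the limit returns the previous value on this worker.
    before(j) = maxNumCompThreads(threadsPerWorker);
    after(j)  = maxNumCompThreads;

    % Keep the worker busy with multithreaded linear algebra.
    t = tic;
    while toc(t) < runSeconds
        a = randn(n) * randn(n);
    end
end

disp([before; after]);   % row 1: old limits, row 2: new limits
```

The reported limit only says what the worker was asked to use; the CPU usage per worker process in top remains the actual check, as in the answer above.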
As for the second issue, I got confirmation from MathWorks that the parent Matlab process won't start more threads than the number of physical cores, even if you explicitly raise the limit beyond that. So in the documentation, the sentence:
"Currently, the maximum number of computational threads is equal to the number of computational cores on your machine."

should say:

"Currently, the maximum number of computational threads is equal to the number of physical cores on your machine."
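One rough way to see the physical-versus-logical distinction on a given machine is to compare Matlab's automatic thread limit with the number of logical processors reported by the JVM. This is only a sketch: it assumes Matlab is running with the JVM enabled (not started with -nojvm), and the interpretation of the two numbers is an assumption based on the answer above rather than documented behavior.

```matlab
% Sketch: compare the automatic thread limit with the logical CPU count.
% Assumes the JVM is available in this Matlab session.
rt = java.lang.Runtime.getRuntime();
nLogical = rt.availableProcessors();     % logical processors seen by the JVM

prev = maxNumCompThreads('automatic');   % reset to automatic; returns old limit
nDefault = maxNumCompThreads;            % the limit Matlab chose on its own
maxNumCompThreads(prev);                 % restore the earlier setting
nRestored = maxNumCompThreads;

fprintf('Logical processors: %d, default thread limit: %d\n', ...
    nLogical, nDefault);
```

On a machine with hyperthreading enabled, a default limit of about half the logical processor count would be consistent with the MathWorks statement quoted above.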