How can I improve CPU utilization when using the multiprocessing module?
Question
I am working in Python 3.4, performing a naive search against partitioned data in memory, and attempting to fork processes to take advantage of all available processing power. I say naive because I am certain there are other things that could be done to improve performance, but those are out of scope for the question at hand.
The system I am testing on is a Windows 7 x64 environment.
我想实现的是在cpu_count() - 1
个内核之间相对均匀,同时的分布(阅读表明,由于基线os系统进程,针对所有内核而不是n-1个内核进行分配不会显示任何其他改进).因此,对于4核计算机,有75%的钉住cpu的使用率.
What I would like to achieve is a relatively even, simultaneous distribution across cpu_count() - 1 cores (my reading suggests that distributing across all cores rather than n - 1 cores shows no additional improvement, due to baseline OS processes). So: 75% pegged CPU usage on a 4-core machine.
What I am seeing (using the Windows Task Manager 'Performance' and 'Processes' tabs) is that I never achieve greater than 25% system-dedicated CPU utilization, and that the process view shows computation occurring on one core at a time, switching every few seconds between the forked processes.
I haven't instrumented the code for timing, but I am pretty sure my subjective observations are correct in that I am not getting the performance increase I expected (3x on an i5-3320M).
I haven't tested on Linux.
Based on the code presented: how can I achieve 75% CPU utilization?
# pseudo code
def search_method(search_term, partition):
    <perform fuzzy search>
    return results

partitions = [<list of lists>]
search_terms = [<list of search terms>]

# real code
import multiprocessing as mp

pool = mp.Pool(processes=mp.cpu_count() - 1)

for search_term in search_terms:
    results = []
    results = [pool.apply(search_method, args=(search_term, partitions[x])) for x in range(len(partitions))]
Answer
You're actually not doing anything concurrently here, because you're using pool.apply, which will block until the task you pass to it is complete. So, for every item in partitions, you're running search_method in some process inside of pool, waiting for it to complete, and then moving on to the next item. That perfectly coincides with what you're seeing in the Windows process manager. You want pool.apply_async instead:
for search_term in search_terms:
    results = [pool.apply_async(search_method, args=(search_term, partitions[x])) for x in range(len(partitions))]
    # Get the actual results from the AsyncResult objects returned.
    results = [r.get() for r in results]
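Note that on Windows there is no fork: multiprocessing spawns fresh interpreter processes that re-import the script, so in a complete program the pool setup and the dispatch loop must live under an if __name__ == '__main__': guard. Here is a minimal runnable sketch of the apply_async pattern; search_method and the data are dummy placeholders, not the asker's real fuzzy search:

import multiprocessing as mp

def search_method(search_term, partition):
    # Dummy stand-in for the real fuzzy search: simple substring match.
    return [item for item in partition if search_term in item]

if __name__ == '__main__':
    partitions = [['apple', 'apricot'], ['banana', 'berry']]  # dummy data
    search_terms = ['ap', 'ber']
    pool = mp.Pool(processes=mp.cpu_count() - 1)
    for search_term in search_terms:
        # Dispatch all partitions at once; apply_async returns immediately.
        async_results = [pool.apply_async(search_method, args=(search_term, p))
                         for p in partitions]
        # Block only when collecting, after everything has been submitted.
        results = [r.get() for r in async_results]
        print(search_term, results)
    pool.close()
    pool.join()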
Or better yet, use pool.map (along with functools.partial to enable passing multiple arguments to our worker function):
from functools import partial

...

for search_term in search_terms:
    func = partial(search_method, search_term)
    results = pool.map(func, partitions)
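To make the partial step concrete: partial(search_method, search_term) returns a one-argument callable with search_term already bound, which is the shape pool.map expects. A tiny illustration with dummy data (the substring match is a stand-in for the real fuzzy search):

from functools import partial

def search_method(search_term, partition):
    # Dummy stand-in for the real fuzzy search: simple substring match.
    return [item for item in partition if search_term in item]

func = partial(search_method, 'ap')        # binds search_term='ap'
print(func(['apple', 'banana', 'grape']))  # same as search_method('ap', ['apple', 'banana', 'grape'])
# -> ['apple', 'grape']

Unlike the pool.apply loop, pool.map hands all the partitions to the pool up front and blocks until every one is done, returning the results in order, which is what keeps the workers busy simultaneously.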