Python multiprocessing.Pool() doesn't use 100% of each CPU


Question

I am working on multiprocessing in Python. For example, consider the example given in the Python multiprocessing documentation (I have changed 100 to 1000000 in the example, just to consume more time). When I run this, I do see that Pool() is using all 4 processes, but I don't see each CPU moving up to 100%. How can I get each CPU to run at 100%?

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)            
    result = pool.map(f, range(10000000))  

Answer

It is because multiprocessing requires interprocess communication between the main process and the worker processes behind the scenes, and in your case that communication overhead takes more (wall-clock) time than the "actual" computation (x * x).
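One way to see this claim in action (my sketch, not part of the original answer): time the same light kernel with a plain serial `map` and with `Pool.map`. For a kernel this cheap, the pickling and IPC cost of `Pool.map` tends to dominate, so the parallel version is often no faster, and sometimes slower.

```python
# Timing sketch: serial vs. Pool for the OP's light x*x kernel.
# The point is the overhead comparison, not the exact numbers.
import time
from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == '__main__':
    n = 100000

    t0 = time.perf_counter()
    serial = list(map(f, range(n)))
    serial_time = time.perf_counter() - t0

    with Pool(processes=4) as pool:
        t0 = time.perf_counter()
        parallel = pool.map(f, range(n))
        parallel_time = time.perf_counter() - t0

    assert serial == parallel  # same results either way
    print(f"serial:   {serial_time:.3f}s")
    print(f"parallel: {parallel_time:.3f}s")
```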

Try a "heavier" computation kernel instead, for example:

import math
from functools import reduce  # no longer a builtin in Python 3

def f(x):
    # note: x must be > 0, since the first step computes log(x + 0)
    return reduce(lambda a, b: math.log(a + b), range(10**5), x)
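To check the effect (my sketch, assuming Python 3 and the same 4-worker Pool as in the question), drive this heavier kernel through `Pool.map` and watch CPU usage: each job now does 10**5 `log()` calls, so the workers spend far more time computing than communicating with the parent process.

```python
# Sketch: the answer's "heavy" kernel run through a 4-process Pool.
import math
from functools import reduce
from multiprocessing import Pool

def f(x):
    return reduce(lambda a, b: math.log(a + b), range(10**5), x)

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # Inputs start at 1, not 0: f(0) would call math.log(0).
        results = pool.map(f, range(1, 21))
    print(len(results))  # one result per input
```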

Update (clarification)

I pointed out that the low CPU usage the OP observed was due to the IPC overhead inherent in multiprocessing, but the OP didn't need to worry about it too much, because the original computation kernel was far too "light" to serve as a benchmark. In other words, multiprocessing performs worst with such a "light" kernel. If the OP implements real-world logic on top of multiprocessing (which, I'm sure, will be somewhat "heavier" than x * x), the OP will achieve decent efficiency, I assure you. My argument is backed by an experiment with the "heavy" kernel I presented.

@FilipMalczak, I hope my clarification makes sense to you.

By the way, there are some ways to improve the efficiency of x * x while still using multiprocessing. For example, we can combine 1,000 jobs into one before submitting them to Pool, unless we are required to solve each job in real time (e.g., if you implement a REST API server, we shouldn't batch this way).
