Python multiprocessing.Pool() doesn't use 100% of each CPU


Question

I am working on multiprocessing in Python. For example, consider the example given in the Python multiprocessing documentation (I changed 100 to 1000000, just to consume more time). When I run it, I do see that Pool() is using all 4 processes, but I don't see each CPU going up to 100%. How can I achieve 100% usage of each CPU?

from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == '__main__':
    # Spread the work across 4 worker processes.
    pool = Pool(processes=4)
    result = pool.map(f, range(10000000))

Answer

It is because multiprocessing requires inter-process communication between the main process and the worker processes behind the scenes, and in your case that communication overhead takes more (wall-clock) time than the "actual" computation (x * x).

Try a "heavier" computation kernel instead, such as:

import math
from functools import reduce  # reduce is a builtin only in Python 2

def f(x):
    return reduce(lambda a, b: math.log(a + b), range(10**5), x)
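
To make the comparison concrete, here is a small timing sketch of my own (it is not part of the original answer, and the input sizes are arbitrary) that maps both the light x * x kernel and the heavier kernel, first serially and then through a 4-process Pool:

import math
import time
from functools import reduce
from multiprocessing import Pool

def light(x):
    return x * x

def heavy(x):
    # Same "heavier" kernel as above: 10**5 chained log operations per call.
    return reduce(lambda a, b: math.log(a + b), range(10**5), x)

def bench(func, data, pool=None):
    # Wall-clock time of mapping `func` over `data`, serially or via a Pool.
    start = time.perf_counter()
    if pool is None:
        list(map(func, data))
    else:
        pool.map(func, data)
    return time.perf_counter() - start

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # Light kernel: IPC overhead dominates, so the Pool may be no faster.
        print('light, serial:', bench(light, range(1_000_000)))
        print('light, pool  :', bench(light, range(1_000_000), pool))
        # Heavy kernel: computation dominates, so the worker CPUs stay busy.
        print('heavy, serial:', bench(heavy, range(1, 201)))
        print('heavy, pool  :', bench(heavy, range(1, 201), pool))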

Update (clarification)

I pointed out that the low CPU usage observed by the OP was due to the IPC overhead inherent in multiprocessing, but the OP need not worry about it too much, because the original computation kernel was far too "light" to serve as a benchmark. In other words, multiprocessing performs worst with such an overly "light" kernel. If the OP implements real-world logic (which, I am sure, will be somewhat "heavier" than x * x) on top of multiprocessing, the OP will achieve decent efficiency, I assure. My argument is backed up by the experiment with the "heavy" kernel presented above.

@FilipMalczak, I hope my clarification makes sense to you.

By the way, there are ways to improve the efficiency of x * x while using multiprocessing. For example, we can combine 1,000 jobs into one before submitting them to the Pool, unless we are required to solve each job in real time (e.g. if you implement a REST API server, we shouldn't batch this way).
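
As a rough sketch of that batching idea (the helper square_batch is my own name, not from the answer), one could group the inputs into chunks of 1,000 before handing them to the Pool:

from multiprocessing import Pool

def square_batch(batch):
    # One task squares a whole batch, so each IPC round-trip carries 1,000 jobs.
    return [x * x for x in batch]

if __name__ == '__main__':
    data = list(range(10000000))
    batches = [data[i:i + 1000] for i in range(0, len(data), 1000)]
    with Pool(processes=4) as pool:
        nested = pool.map(square_batch, batches)
    result = [y for chunk in nested for y in chunk]

    # Pool.map's built-in chunksize argument has a similar effect without
    # restructuring the function:
    #     result = pool.map(f, data, chunksize=1000)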
