multiprocessing.dummy in Python is not utilising 100% cpu
Question
I am doing a machine learning project in Python, so I need a parallel predict function, which I use in my program.
from multiprocessing.dummy import Pool
from multiprocessing import cpu_count

def multi_predict(X, predict, *args, **kwargs):
    pool = Pool(cpu_count())
    results = pool.map(predict, X)
    pool.close()
    pool.join()
    return results
The problem is that all my CPUs are loaded only at 20-40% (in sum it's 100%). I use multiprocessing.dummy because I have some problems pickling the function with the multiprocessing module.
Answer
When you use multiprocessing.dummy, you're using threads, not processes:
multiprocessing.dummy replicates the API of multiprocessing but is no more than a wrapper around the threading module.
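As a quick confirmation (a small sketch, not from the original post), multiprocessing.dummy.Pool simply constructs a multiprocessing.pool.ThreadPool, so every "worker" is a thread inside the current process:

```python
import threading
from multiprocessing.dummy import Pool
from multiprocessing.pool import ThreadPool

# multiprocessing.dummy.Pool builds a ThreadPool, so the workers
# are threads in this process, not child processes.
pool = Pool(4)
print(isinstance(pool, ThreadPool))  # True

# Each task reports the thread it ran on. No pickling is involved,
# which is why a lambda works here but not with multiprocessing.Pool.
ids = pool.map(lambda _: threading.current_thread().ident, range(4))
pool.close()
pool.join()
```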
That means you're restricted by the Global Interpreter Lock (GIL), and only one thread can actually execute CPU-bound operations at a time. That's going to keep you from fully utilizing your CPUs. If you want to get full parallelism across all available cores, you're going to need to address the pickling issue you're hitting with multiprocessing.Pool.
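One common way around the pickling issue (a sketch; predict here is a hypothetical stand-in for the asker's model function) is to define the callable at module top level, which pickle can serialize by qualified name, and bind any extra arguments with functools.partial:

```python
from functools import partial
from multiprocessing import Pool, cpu_count

# Module-level functions pickle by qualified name, so they can be
# sent to worker processes; lambdas and nested functions cannot.
def predict(x, scale=1):
    return x * x * scale

def multi_predict(X, fn, *args, **kwargs):
    # partial() lets extra arguments ride along while keeping
    # the callable picklable.
    worker = partial(fn, *args, **kwargs)
    with Pool(cpu_count()) as pool:
        return pool.map(worker, X)

if __name__ == "__main__":
    print(multi_predict([1, 2, 3], predict, scale=2))  # [2, 8, 18]
```

The `if __name__ == "__main__"` guard matters on platforms that use the spawn start method, where worker processes re-import the main module.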
Note that multiprocessing.dummy might still be useful if the work you need to parallelize is IO-bound, or uses a C extension that releases the GIL. For pure Python code, however, you'll need multiprocessing.
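To illustrate the IO-bound case (a sketch using time.sleep to simulate blocking IO), threads overlap their waiting even under the GIL, so four 0.2-second tasks finish in roughly 0.2 seconds rather than 0.8:

```python
import time
from multiprocessing.dummy import Pool

def fake_io(_):
    # Blocking "IO": sleep releases the GIL, so threads overlap.
    time.sleep(0.2)

start = time.perf_counter()
with Pool(4) as pool:
    pool.map(fake_io, range(4))
elapsed = time.perf_counter() - start
print(f"4 tasks of 0.2s finished in {elapsed:.2f}s")  # ~0.2s, not 0.8s
```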