Why are numpy calculations not affected by the global interpreter lock?


Question

I'm trying to decide if I should use multiprocessing or threading, and I've learned some interesting bits about the Global Interpreter Lock (GIL). According to this nice blog post, multithreading isn't suitable for busy (CPU-bound) tasks. However, I also learned that some functionality, such as I/O or numpy, is unaffected by the GIL.

Can anyone explain why, and how I can find out whether my (probably quite numpy-heavy) code is going to be suitable for multithreading?

Recommended answer

Many numpy calculations are unaffected by the GIL, but not all.

Code that does not require the Python interpreter (e.g. C libraries) can explicitly release the GIL, allowing other code that does depend on the interpreter to continue running. In the NumPy C codebase, the macros NPY_BEGIN_THREADS and NPY_END_THREADS are used to delimit blocks of code that permit GIL release. You can see these in this search of the numpy source.

The NumPy C API documentation has more information on threading support. Note the additional macros NPY_BEGIN_THREADS_DESCR, NPY_END_THREADS_DESCR and NPY_BEGIN_THREADS_THRESHOLDED, which handle conditional GIL release depending on array dtypes and the size of loops.

Most core functions release the GIL. For example, Universal Functions (ufunc) do so, as described:

as long as no object arrays are involved, the Python Global Interpreter Lock (GIL) is released prior to calling the loops. It is re-acquired if necessary to handle error conditions.
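A minimal sketch of the object-array caveat in the quote above, assuming a plain element-wise addition as the workload. The helper names `double` and `run_in_threads` are made up for this illustration. The float64 arrays go through a compiled ufunc loop that, per the quote, runs with the GIL released; the object arrays have to call back into the interpreter for every element, so threads cannot overlap that work. Timings depend on the machine, the NumPy build and the array size, so none are shown here.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def double(arr):
    """Element-wise arr + arr via the np.add ufunc."""
    return np.add(arr, arr)


def run_in_threads(arrays, workers=4):
    """Apply double() to each array in a thread pool; return (seconds, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(double, arrays))
    return time.perf_counter() - start, results


# Same values, two dtypes: native float64 vs. boxed Python objects.
float_arrays = [np.arange(200_000, dtype=np.float64) for _ in range(4)]
object_arrays = [a.astype(object) for a in float_arrays]

float_time, float_results = run_in_threads(float_arrays)
object_time, object_results = run_in_threads(object_arrays)

print(f"float64 arrays: {float_time:.4f} s")
print(f"object arrays:  {object_time:.4f} s")
```

If the object-array run is much slower and does not improve with more workers, that is consistent with the GIL being held throughout, although the per-element overhead of object arrays also plays a part.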

With regard to your own code, the source code for NumPy is available. Check the functions you use (and the functions they call) for the above macros. Note also that the performance benefit depends heavily on how long the GIL is released: if your code is constantly dropping in and out of Python, you won't see much of an improvement.
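As a rough illustration of the "dropping in and out of Python" point, the sketch below computes the same sum of squares in two ways. The function names `coarse` and `fine` are hypothetical. `coarse` makes two large ufunc-based calls per array, so most of its time is spent in compiled loops; `fine` steps through the array from a Python loop, so the interpreter is involved on every element and threads gain little.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def coarse(arr):
    """A few large NumPy calls: long stretches of compiled code per call."""
    return float(np.square(arr).sum())


def fine(arr):
    """Many tiny steps driven from a Python loop: the interpreter
    (and therefore the GIL) is needed on every iteration."""
    total = 0.0
    for x in arr:
        total += float(x) * float(x)
    return total


# Four equal chunks of work, one per thread.
chunks = [np.full(50_000, 2.0) for _ in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    coarse_sums = list(pool.map(coarse, chunks))
    fine_sums = list(pool.map(fine, chunks))

# Both approaches give the same answer; only where the time goes differs.
print(coarse_sums == fine_sums)
```

Restructuring code so that each thread hands NumPy a large block of work at once is usually what makes threading worthwhile.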

The other option is to just test it. However, bear in mind that functions using the conditional GIL macros may exhibit different behaviour with small and large arrays. A test with a small dataset may therefore not be an accurate representation of performance for a larger task.
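One way to "just test it" is sketched below, under the assumption that the workload can be split into independent arrays. `work` is a stand-in for whatever NumPy calls your code actually makes, and `compare` is a hypothetical helper; both names are invented here. The loop runs the comparison at a small and a large size, since the two can behave differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def work(arr):
    """Stand-in workload; swap in the NumPy calls your code really makes."""
    return np.sqrt(arr * arr + 1.0)


def compare(size, tasks=4, workers=4):
    """Time `tasks` calls of work() run serially and then in a thread pool."""
    arrays = [np.linspace(0.0, 1.0, size) for _ in range(tasks)]

    start = time.perf_counter()
    serial = [work(a) for a in arrays]
    serial_time = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        threaded = list(pool.map(work, arrays))
    threaded_time = time.perf_counter() - start

    # Both paths must produce identical results for the comparison to be fair.
    assert all(np.array_equal(s, t) for s, t in zip(serial, threaded))
    return serial_time, threaded_time


for size in (100, 1_000_000):
    serial_time, threaded_time = compare(size)
    print(f"size={size:>9}: serial {serial_time:.5f} s, threaded {threaded_time:.5f} s")
```

A single run is noisy, so repeat the measurement a few times and use array sizes close to those of the real task before drawing conclusions.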

There is some additional information on parallel processing with numpy on the official wiki, and a useful post about the Python GIL in general over on Programmers.SE.
