Parallelizing a Numpy vector operation


Question

Let's use, for example, numpy.sin().

The following code returns the sine of each value in the array a:

import numpy
a = numpy.arange( 1000000 )
result = numpy.sin( a )

But my machine has 32 cores, so I'd like to make use of them. (The overhead might not be worthwhile for something like numpy.sin(), but the function I actually want to use is quite a bit more complicated, and I will be working with a huge amount of data.)

Is this the best (read: smartest or fastest) method:

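import numpy
from multiprocessing import Pool

if __name__ == '__main__':
    a = numpy.arange( 1000000 )
    pool = Pool()
    # pool.map iterates over a, so numpy.sin is called on one element at a time
    result = pool.map( numpy.sin, a )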

Or is there a better way?

Recommended answer

There is a better way: numexpr

Slightly reworded from their main page:

It's a multi-threaded VM written in C that analyzes expressions, rewrites them more efficiently, and compiles them on the fly into code that gets near-optimal parallel performance for both memory-bound and CPU-bound operations.

For example, on my 4-core machine, evaluating a sine is about 3.5 times faster than with numpy, so just slightly less than 4 times (15.6 ms versus 54 ms in the session below).

In [1]: import numpy as np
In [2]: import numexpr as ne
In [3]: a = np.arange(1000000)
In [4]: timeit ne.evaluate('sin(a)')
100 loops, best of 3: 15.6 ms per loop    
In [5]: timeit np.sin(a)
10 loops, best of 3: 54 ms per loop
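numexpr is not limited to a single function call; the whole expression is passed as a string and evaluated in one pass. Here is a minimal sketch with a compound expression. The expression itself and the array b are made up for illustration and are not from the question, and the names are passed explicitly through local_dict rather than picked up from the calling scope.

```python
import numpy as np
import numexpr as ne

a = np.arange(1000000, dtype=np.float64)
b = np.linspace(0.0, 1.0, a.size)  # hypothetical second input, for illustration only

# The whole expression is handed to numexpr as a string; the arrays it
# refers to are supplied by name through local_dict.
result = ne.evaluate('sin(a) ** 2 + cos(b) * exp(-b)',
                     local_dict={'a': a, 'b': b})

# The same computation in plain numpy, used only to check the result.
expected = np.sin(a) ** 2 + np.cos(b) * np.exp(-b)
assert np.allclose(result, expected)
```

The plain numpy version builds several temporary arrays along the way, which is one reason a single numexpr call can pay off on large inputs.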

See the numexpr documentation for details, including the list of supported functions. You'll have to check it, or give us more information, to see whether your more complicated function can be evaluated by numexpr.
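If the function turns out not to be expressible in numexpr, one fallback (not part of the answer above, just a sketch) is to keep using worker processes but hand each worker a large chunk of the array instead of single elements, so the function stays vectorized inside each worker. The helper name apply_in_chunks and the chunk count of 32 are assumptions for illustration, and the function being applied must be picklable and must accept an array.

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor


def apply_in_chunks(func, a, n_chunks, executor):
    """Apply a vectorized func to a in n_chunks pieces (hypothetical helper)."""
    # Split into roughly equal pieces; each piece is sent to a worker whole.
    chunks = np.array_split(a, n_chunks)
    # executor.map preserves order, so the pieces can be glued back together.
    return np.concatenate(list(executor.map(func, chunks)))


if __name__ == '__main__':
    a = np.arange(1000000)
    with ProcessPoolExecutor() as executor:
        # 32 chunks is an arbitrary choice echoing the 32 cores in the question.
        result = apply_in_chunks(np.sin, a, n_chunks=32, executor=executor)
    assert np.allclose(result, np.sin(a))
```

Each chunk still has to be pickled and copied to a worker and back, so whether this beats plain numpy depends on how expensive the function is relative to that transfer; it is worth timing on real data before committing to it.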
