numpy np.apply_along_axis function speed up?

Question

The np.apply_along_axis() function seems to be very slow (no output after 15 mins). Is there a fast way to perform this function on a long array without having to parallelize the operation? I am specifically talking about arrays with millions of elements.

Here is an example of what I am trying to do. Please ignore the simplistic definition of my_func, the goal is not to multiply the array by 55 (which of course can be done in place anyway) but an illustration. In practice, my_func is a little more complicated, takes extra arguments and as a result each element of a is modified differently, i.e. not just multiplied by 55.

>>> import numpy as np
>>> def my_func(a):
...     return a[0]*55
>>> a = np.ones((200000000,1))
>>> np.apply_along_axis(my_func, 1, a)

>>> a = np.ones((20,1))
>>> def my_func(a, i, j):
...     b = np.zeros((2,2))
...     b[0,0] = a[i]
...     b[1,0] = a[i]
...     b[0,1] = a[i]
...     b[1,1] = a[j]
...     return np.linalg.eigh(b)
...
>>> my_func(a,1,1)
(array([ 0.,  2.]), array([[-0.70710678,  0.70710678],
       [ 0.70710678,  0.70710678]]))

Answer

np.apply_along_axis is not for speed.

There is no way to apply a pure Python function to every element of a Numpy array without calling it that many times, short of AST rewriting...

Fortunately, there are solutions:

  • Vectorizing

Although this is often hard, it's normally the easy solution. Find some way to express your calculation in a way that generalizes over the elements, so you can work on the whole matrix at once. This will result in the loops being hoisted out of Python and into heavily optimised C and Fortran routines.

  • JITing: Numba and Parakeet, and to a lesser extent PyPy with NumPyPy

Numba and Parakeet both deal with JITing loops over Numpy data structures, so if you inline the looping into a function (this can be a wrapper function), you can get massive speed boosts almost for free; a minimal Numba sketch follows after this list. This depends on the data structures used, though.

  • Symbolic evaluators like Theano and numexpr

These allow you to use embedded languages to express calculations, which can end up much faster than even the vectorized versions.

  • Cython and C extensions

If all else is lost, you can always dig down manually to C. Cython hides a lot of the complexity and has a lot of lovely magic too, so it's not always that bad (although it helps to know what you're doing).
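
For the JITing route, a minimal sketch with Numba could look like this (an assumption on my part: your real my_func can be written with operations Numba supports in nopython mode; my_func_scalar and apply_my_func are just illustrative names):

import numpy as np
import numba

# Hypothetical stand-in for the real per-element computation.
@numba.njit
def my_func_scalar(x):
    return x * 55.0

# The loop itself gets compiled, so the millions of calls to
# my_func_scalar never go through the Python interpreter.
@numba.njit
def apply_my_func(a):
    out = np.empty(a.shape[0])
    for i in range(a.shape[0]):
        out[i] = my_func_scalar(a[i, 0])
    return out

a = np.ones((20000000, 1))
result = apply_my_func(a)  # the first call includes compilation time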

Here you go.

This is my testing "environment" (you should really have provided this :P):

import itertools
import numpy

a = numpy.arange(200).reshape((200,1)) ** 2

def my_func(a, i,j):
    b = numpy.zeros((2,2))
    b[0,0] = a[i]
    b[1,0] = a[i]
    b[0,1] = a[i]
    b[1,1] = a[j]
    return  numpy.linalg.eigh(b)

eigvals = {}
eigvecs = {}

for i, j in itertools.combinations(range(a.size), 2):
    eigvals[i, j], eigvecs[i, j] = my_func(a,i,j)
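
(For scale: with 200 elements, that loop runs one tiny eigendecomposition per index pair, so there are quite a few of them:)

len(eigvals)
#>>> 19900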

Now, it's far easier to get all the permutations instead of the combinations, because you can just do this:

# All *permutations*, not combinations
indexes = numpy.mgrid[:a.size, :a.size]

This might seem wasteful, but there are only about twice as many permutations, so it's not a big deal.

So we want to use these indexes to get the relevant elements:

# Remove the extra dimension; it's not wanted here!
subs = a[:,0][indexes]
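
As a quick check of what that gives (a small aside, using the 200-element a from above): subs[0] holds a[i] and subs[1] holds a[j] for every pair (i, j):

subs.shape
#>>> (2, 200, 200)

subs[0][10, 20], subs[1][10, 20]
#>>> (100, 400)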

and then we can make our matrices:

target = numpy.array([
    [subs[0], subs[0]],
    [subs[0], subs[1]]
])

We need the matrices to be in the last two dimensions:

target.shape
#>>> (2, 2, 200, 200)

target = numpy.swapaxes(target, 0, 2)
target = numpy.swapaxes(target, 1, 3)

target.shape
#>>> (200, 200, 2, 2)
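
(As an aside, the two swapaxes calls are just one axis permutation, so the same reshuffle can be written in a single step; a small equivalent sketch, where alt is just a throwaway name:)

# same reshuffle as the swapaxes pair: move the 2x2 matrix axes to the end
alt = numpy.array([
    [subs[0], subs[0]],
    [subs[0], subs[1]]
]).transpose(2, 3, 0, 1)

numpy.array_equal(alt, target)
#>>> True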

And we can check that it works:

target[10, 20]
#>>> array([[100, 100],
#>>>        [100, 400]])

Yes!

So we just run numpy.linalg.eigh:

values, vectors = numpy.linalg.eigh(target)

And look, it works:

values[10, 20]
#>>> array([  69.72243623,  430.27756377])

eigvals[10, 20]
#>>> array([  69.72243623,  430.27756377])

So then I'd imagine you might want to concatenate these:

numpy.concatenate([values[row, row+1:] for row in range(len(values))])
#>>> array([[  0.00000000e+00,   1.00000000e+00],
#>>>        [  0.00000000e+00,   4.00000000e+00],
#>>>        [  0.00000000e+00,   9.00000000e+00],
#>>>        ..., 
#>>>        [  1.96997462e+02,   7.78160025e+04],
#>>>        [  3.93979696e+02,   7.80160203e+04],
#>>>        [  1.97997475e+02,   7.86070025e+04]])

numpy.concatenate([vectors[row, row+1:] for row in range(len(vectors))])
#>>> array([[[ 1.        ,  0.        ],
#>>>         [ 0.        ,  1.        ]],
#>>> 
#>>>        [[ 1.        ,  0.        ],
#>>>         [ 0.        ,  1.        ]],
#>>> 
#>>>        [[ 1.        ,  0.        ],
#>>>         [ 0.        ,  1.        ]],
#>>> 
#>>>        ..., 
#>>>        [[-0.70890372,  0.70530527],
#>>>         [ 0.70530527,  0.70890372]],
#>>> 
#>>>        [[-0.71070503,  0.70349013],
#>>>         [ 0.70349013,  0.71070503]],
#>>> 
#>>>        [[-0.70889463,  0.7053144 ],
#>>>         [ 0.7053144 ,  0.70889463]]])
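
For reference, each of those concatenations collapses the strict upper triangle of the 200x200 grid, i.e. the 200*199/2 = 19900 unique index pairs:

numpy.concatenate([values[row, row+1:] for row in range(len(values))]).shape
#>>> (19900, 2)

numpy.concatenate([vectors[row, row+1:] for row in range(len(vectors))]).shape
#>>> (19900, 2, 2)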

It's also possible to do this concatenation step just after numpy.mgrid to halve the amount of work:

# All *permutations*, not combinations
indexes = numpy.mgrid[:a.size, :a.size]

# Convert to all *combinations* and reduce the dimensionality
indexes = numpy.concatenate([indexes[:, row, row+1:] for row in range(indexes.shape[1])], axis=1)

# Remove the extra dimension; it's not wanted here!
subs = a[:,0][indexes]

target = numpy.array([
    [subs[0], subs[0]],
    [subs[0], subs[1]]
])

target = numpy.rollaxis(target, 2)

values, vectors = numpy.linalg.eigh(target)

Yeah, that last sample is all you need.
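
And if you want to convince yourself that this combination-based version matches the plain loop from the testing "environment" above, here's a quick sanity check (it relies on itertools.combinations yielding pairs in the same row-major order as the concatenated indexes, which it does):

import itertools

pairs = list(itertools.combinations(range(a.size), 2))
assert len(pairs) == values.shape[0]

# eigvals was filled by the explicit loop at the top
assert all(numpy.allclose(values[k], eigvals[i, j]) for k, (i, j) in enumerate(pairs))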
