Numba on nested Numpy arrays
Question

Setup
I have the following two implementations of a matrix-calculation:
1. The first implementation uses a `matrix` of shape `(n, m)`, and the calculation is repeated in a for-loop `repetition` times:
```python
import numpy as np
from numba import jit

@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):
            _deleteA = (
                matrix[i, j] +
                #some constants added here
            )
            _deleteB = (
                matrix[i, j-1] +
                #some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)
    return matrix

repetition = 3
for x in range(repetition):
    foo()
```
2. The second implementation avoids the extra for-loop and, hence, includes `repetition = 3` into the matrix, which is then of shape `(repetition, n, m)`:
```python
@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):
            _deleteA = (
                matrix[:, i, j] +
                #some constants added here
            )
            _deleteB = (
                matrix[:, i, j-1] +
                #some constants added here
            )
            matrix[:, i, j] = np.amin(np.stack((_deleteA, _deleteB), axis=1), axis=1)
    return matrix
```
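Incidentally (my own note, not part of the original question): the `np.amin(np.stack(...), axis=1)` reduction above is just an element-wise minimum of the two vectors, so `np.minimum` computes the same result without building the stacked temporary. A minimal sketch with made-up values standing in for `_deleteA` and `_deleteB` (the question elides the actual constants):

```python
import numpy as np

# Hypothetical stand-ins for _deleteA and _deleteB (the original
# post elides the added constants).
a = np.array([3.0, 1.0, 4.0])
b = np.array([2.0, 5.0, 0.0])

# The reduction used in the question: stack along a new axis, then reduce it.
stacked = np.amin(np.stack((a, b), axis=1), axis=1)

# Element-wise minimum gives the same result without the temporary stack.
elementwise = np.minimum(a, b)

print(stacked)      # [2. 1. 0.]
print(elementwise)  # [2. 1. 0.]
```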
Questions
Regarding both implementations, I discovered two things about their performance, measured with `%timeit` in IPython.
- The first implementation profits hugely from `@jit`, while the second does not at all (28 ms vs. 25 s in my test case). Can anybody imagine why `@jit` does not work anymore with a numpy array of shape `(repetition, n, m)`?
Edit
I moved the former second question to an extra post since asking multiple questions is considered bad SO style.
The question was:
- When neglecting `@jit`, the first implementation is still a lot faster (same test case: 17 s vs. 26 s). Why is numpy slower when working on three instead of two dimensions?
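A detail worth illustrating here (my own sketch, not from the original post): in the 2-D variant each iteration compares two scalars, while in the 3-D variant each iteration slices out small arrays and `np.stack` allocates a fresh temporary every time:

```python
import numpy as np

# Hypothetical (repetition, n, m) array standing in for the question's matrix.
matrix = np.zeros((3, 5, 4))

# Basic indexing like matrix[:, i, j] returns a strided view (no copy) ...
column = matrix[:, 1, 1]
print(np.shares_memory(column, matrix))   # True

# ... but np.stack copies both inputs into a brand-new array, so every
# (i, j) iteration of the 3-D variant pays for a fresh small allocation.
stacked = np.stack((column, matrix[:, 1, 0]), axis=1)
print(np.shares_memory(stacked, matrix))  # False
```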
Answer
I'm not sure what your setup is here, but I re-wrote your example slightly:
```python
import numpy as np
from numba import jit

#@jit(nopython=True)
def foo(matrix):
    n, m = matrix.shape
    for i in range(1, n):
        for j in range(1, m):
            _deleteA = (
                matrix[i, j] #+
                #some constants added here
            )
            _deleteB = (
                matrix[i, j-1] #+
                #some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)
    return matrix

foo_jit = jit(nopython=True)(foo)
```
and then the timings:
```python
m = np.random.normal(size=(100, 50))

%timeit foo(m)      # in a jupyter notebook
# 2.84 ms ± 54.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit foo_jit(m)  # in a jupyter notebook
# 3.18 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
```
So here numba is a lot faster as expected. One thing to consider is that global numpy arrays do not behave in numba as you might expect:
It's usually better to pass in the data as I did in the example.
Your issue in the second case is that numba does not support `amin` at this time. See:
https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html
You can see this if you pass `nopython=True` to `jit`. So in current versions of numba (0.44 or earlier at current), it will fall back to `objectmode`, which often is no faster than not using numba and sometimes is slower since there is some call overhead.