Numba on nested Numpy arrays

Problem description

Setup

I have the following two implementations of a matrix-calculation:

  1. The first implementation uses a matrix of shape (n, m), and the calculation is repeated repetition times in a for-loop:

import numpy as np
from numba import jit

@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):

            _deleteA = (
                        matrix[i, j]  # +
                        # some constants added here
            )
            _deleteB = (
                        matrix[i, j-1]  # +
                        # some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)

    return matrix

repetition = 3
for x in range(repetition):
    foo()


2. The second implementation avoids the extra for-loop by folding repetition = 3 into the matrix, which then has shape (repetition, n, m):

@jit
def foo():
    for i in range(1, n):
        for j in range(1, m):

            _deleteA = (
                        matrix[:, i, j]  # +
                        # some constants added here
            )
            _deleteB = (
                        matrix[:, i, j-1]  # +
                        # some constants added here
            )
            matrix[:, i, j] = np.amin(np.stack((_deleteA, _deleteB), axis=1), axis=1)

    return matrix


Questions

I discovered two things about the performance of both implementations, measured with %timeit in IPython.

  1. The first implementation profits hugely from @jit, while the second does not at all (28 ms vs. 25 s in my test case). Can anybody think of why @jit no longer helps with a numpy array of shape (repetition, n, m)?


Edit

I moved the former second question to a separate post, since asking multiple questions is considered bad SO style.

The question was:

  1. Without @jit, the first implementation is still a lot faster (same test case: 17 s vs. 26 s). Why is numpy slower when working on three instead of two dimensions?

Recommended answer

I'm not sure what your setup is here, but I re-wrote your example slightly:

import numpy as np
from numba import jit

#@jit(nopython=True)
def foo(matrix):
    n, m = matrix.shape
    for i in range(1, n):
        for j in range(1, m):

            _deleteA = (
                        matrix[i, j] #+
                        #some constants added here
            )
            _deleteB = (
                        matrix[i, j-1] #+
                        #some constants added here
            )
            matrix[i, j] = min(_deleteA, _deleteB)

    return matrix

foo_jit = jit(nopython=True)(foo)

And then the timings:

m = np.random.normal(size=(100,50))

%timeit foo(m)  # in a jupyter notebook
# 2.84 ms ± 54.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit foo_jit(m)  # in a jupyter notebook
# 3.18 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

So here numba is a lot faster, as expected. One thing to consider is that global numpy arrays do not behave in numba as you might expect:

https://numba.pydata.org/numba-doc/dev/user/faq.html#numba-doesn-t-seem-to-care-when-i-modify-a-global-variable

It's usually better to pass in the data as I did in the example.
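
To illustrate the point about globals, here is a minimal sketch. The names scale, total_global and total_arg are made up for this example, and the try/except fallback is only there so the snippet still runs where Numba is not installed. Per the FAQ linked above, Numba treats a global array as a compile-time constant, so a jitted function that reads a global may keep using the old data after the global changes, while a function that receives the array as an argument always sees what it is given.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(func):
        return func

scale = np.array([1.0, 2.0])  # hypothetical global array


@njit
def total_global():
    # Reads the global; under Numba its value is frozen at compile time.
    return scale.sum()


@njit
def total_arg(arr):
    # Receives the data explicitly, so it always uses the array passed in.
    return arr.sum()


first = total_arg(scale)
stale_candidate = total_global()  # triggers compilation with the original array

scale = np.array([10.0, 20.0])    # rebind the global to new data
second = total_arg(scale)         # reflects the new data
maybe_stale = total_global()      # under Numba this may still use the old array
```

Passing the array in also lets the function read n and m from matrix.shape instead of relying on further globals.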

Your issue in the second case is that numba does not support amin at this time. See:

https://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

You can see this if you pass nopython=True to jit, because compilation then fails with an error instead of silently falling back. In current versions of numba (0.44 or earlier at the time of writing), a plain @jit falls back to object mode, which is often no faster than not using numba at all and is sometimes slower, since there is some call overhead.
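
One possible workaround, sketched here under assumptions rather than taken from the original code, is to avoid np.amin and np.stack entirely and loop over the repetition axis explicitly, so that only scalar operations remain. The function name foo_3d and the test data are hypothetical, the constants from the question are left out just as in the rewrite above, and the try/except fallback only exists so the snippet runs without Numba.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without Numba.
    def njit(func):
        return func


@njit
def foo_3d(matrix):
    # matrix has shape (repetition, n, m); it is updated in place.
    reps, n, m = matrix.shape
    for i in range(1, n):
        for j in range(1, m):
            for k in range(reps):
                delete_a = matrix[k, i, j]      # + some constants
                delete_b = matrix[k, i, j - 1]  # + some constants
                matrix[k, i, j] = min(delete_a, delete_b)
    return matrix


rng = np.random.default_rng(0)
data = rng.normal(size=(3, 100, 50))
result = foo_3d(data.copy())

# Without the constants, each row i >= 1 becomes a running minimum along j,
# which gives an independent NumPy reference to compare against.
expected = data.copy()
expected[:, 1:, :] = np.minimum.accumulate(data[:, 1:, :], axis=2)
matches = np.allclose(result, expected)
```

Explicit loops like this are usually slow in plain NumPy code but are the kind of code nopython mode compiles well; whether it beats the 2D version in a real case would need to be measured.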
