numba eager compilation? What's the pattern?


Question

I looked into eager compilation on numba's website and couldn't figure out how to specify the types.

The example they use is this:

from numba import jit, int32

@jit(int32(int32, int32))
def f(x, y):
    # A somewhat trivial example
    return x + y

# source: http://numba.pydata.org/numba-doc/latest/user/jit.html#eager-compilation

As you can see, it takes two variables as input and returns a single variable. All of them should be int32.

One way to understand the decorator is that @jit(int32(int32, int32)) could be read as:

@jit(type_of_returned_value(type_of_x, type_of_y))

If that is right (is it?), then how do you specify it for multiple inputs and outputs?

For example, for functions like these:

import numba as nb

@nb.jit
def filter3(a, b):
    return a > b

# Note: Python identifiers cannot start with a digit, so the output array
# is named numpy_2d_array_of_objects here.
@nb.jit
def func3(list_of_arrays_A, list_of_arrays_B, list_of_arrays_C, list_of_arrays_D, numpy_2d_array_of_objects):

    for i in range(len(list_of_arrays_A)):

        for j in range(list_of_arrays_A[i].size):

            if filter3(list_of_arrays_A[i][j], list_of_arrays_B[i][j]):
                numpy_2d_array_of_objects[i][j] = 1

            elif filter3(list_of_arrays_B[i][j], list_of_arrays_A[i][j]):
                numpy_2d_array_of_objects[i][j] = 0

            elif filter3(list_of_arrays_C[i][j], list_of_arrays_D[i][j]):
                numpy_2d_array_of_objects[i][j] = 0
            else:
                numpy_2d_array_of_objects[i][j] = 1

My intention: since I need to speed up a function that is only called once (but takes forever if **not** run with numba), I need to speed up its numba compilation.

Recommended answer

One can always use numba.typeof to infer the type of a variable, e.g.:

import numpy as np
import numba as nb
N=10000
simple_list= [np.zeros(1) for x in range(N)]
nb.typeof(simple_list)
# reflected list(array(float64, 1d, C))

Or:

from numba.typed import List
typed_list=List()
for _ in range(N):
    typed_list.append(np.zeros(1))
nb.typeof(typed_list)
# ListType[array(float64, 1d, C)]

So you can provide the signatures as follows to have the function compiled eagerly, i.e. ahead of the first call:

@nb.jit([nb.void(nb.typeof(typed_list)),
         nb.void(nb.typeof(simple_list))])
def fun(lst):
    pass

Noteworthy details:

  • I'm compiling the function ahead of time in two different versions: one for numba's TypedList (nb.void(nb.typeof(typed_list))) and one for Python's list (nb.void(nb.typeof(simple_list))).
  • I don't use signature strings but the signatures themselves (for example as described here), because, if I understand it correctly, there is no signature string for TypedList or reflected list (more on this below).
  • As the function fun doesn't return anything, its return type is void, hence nb.void(...) in the signatures.

However, the interesting thing is how much more overhead the simple_list version has:

%timeit fun(simple_list)  # 185 ms ± 4.23 ms 
%timeit fun(typed_list)   # 1.18 µs ± 69.3 ns

That is a factor of about 1e5! It is also pretty clear why: in order to check that the passed list really is of type reflected list(array(float64, 1d, C)), numba has to look at every element in the list. For the TypedList, on the other hand, it is much simpler: there cannot be more than one type in the list, so there is no need to iterate over the whole list!

Thus one should prefer to create and use TypedLists rather than just dismissing the deprecation warning.

It is probably not possible to give a signature string for reflected list or TypedList, because right now the following code is used to parse the signature:

def _parse_signature_string(signature_str):
    # Just eval signature_str using the types submodules as globals
    return eval(signature_str, {}, types.__dict__)

And because nb.types.__dict__ has no TypedList or reflected list, we cannot pass them via a string.

Once the functions are compiled (ahead-of-time or just-in-time), the signatures can be inspected in the corresponding Dispatcher object, for example via:

[x.signature for x in fun.overloads.values()]
# [(ListType[array(float64, 1d, C)],) -> none,
#  (reflected list(array(float64, 1d, C)),) -> none]

This can be used to figure out the right return type of the function (here none means void).
