Numba jit warnings interpretation in Python
Question
让我们看另一个例子. 为什么带有签名的版本较慢? 让我们仔细看看签名. 如果在编译时内存布局未知,则通常无法对算法进行SIMD矢量化.当然,您可以显式声明C连续数组,但是该函数将不再适用于非连续输入,这通常是不希望的. I have defined the following recursive array generator and am using Numba jit to try and accelerate the processing (based on this SO answer) I am getting a bunch of warnings/errors that I do not quite get. Would appreciate help in explaining them and making them disappear in order to (I'm assuming) speed up the calculation even more. Here they are below : NumbaWarning:
Compilation is falling back to object mode WITH looplifting enabled because Function "calc_func" failed type inference due to: Invalid use of Function() with argument(s) of type(s): (int64, dtype=Literalstr)
* parameterized In definition 0:
All templates rejected with literals. In definition 1:
All templates rejected without literals.
This error is usually caused by passing an argument of a type that is unsupported by the named function. [1] During: resolving callee type: Function() [2] During: typing of call at File "thenameofmyscript.py", line 71:
@jit("float32:", nopython=False, nogil=True) thenameofmyscript.py:69: NumbaWarning:
Compilation is falling back to object mode WITHOUT looplifting enabled because Function "calc_func" failed type inference due to: cannot determine Numba type of File "thenameofmyscript.py", line 73:
@jit("float32:", nopython=False, nogil=True) H:\projects\decay-optimizer\venv\lib\site-packages\numba\compiler.py:742: NumbaWarning: Function "calc_func" was compiled in object mode without forceobj=True, but has lifted loops. File "thenameofmyscript.py", line 70:
self.func_ir.loc)) H:\projects\decay-optimizer\venv\lib\site-packages\numba\compiler.py:751: NumbaDeprecationWarning:
Fall-back from the nopython compilation path to the object mode compilation path has been detected, this is deprecated behaviour. File "thenameofmyscript.py", line 70:
warnings.warn(errors.NumbaDeprecationWarning(msg, self.func_ir.loc)) thenameofmyscript.py:69: NumbaWarning: Code running in object mode won't allow parallel execution despite nogil=True.
@jit("float32:", nopython=False, nogil=True)
Answer

Modern CPUs are quite fast at additions, subtractions and multiplications. Operations like exponentiation should be avoided where possible.

Example

In this example I replaced the costly exponentiation by a simple multiplication: since b ** (i - 1) grows by a factor of b at each iteration, it can be carried along in a scalar that is multiplied by b once per step instead of being recomputed. Simplifications like that can lead to quite high speedups, but they may also change the result. First, your implementation (float64) without any signatures; I will treat signatures later on another simple example.
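As for the warnings themselves: the failed type inference is triggered by np.empty(n, dtype="float32") — in nopython mode the Numba versions of that era did not accept a string literal as the dtype argument, while a NumPy dtype object works (the answer's own code uses np.float64 the same way). A minimal sketch (the name calc_func_nopython is mine, not from the original answer) of the question's function compiled fully in nopython mode:

import numba as nb
import numpy as np

# Sketch (not from the original answer): same algorithm, but with a NumPy
# dtype object instead of the string "float32", so type inference succeeds
# and the function compiles in nopython mode without the fallback warnings.
@nb.njit(nogil=True)
def calc_func_nopython(a, b, n):
    res = np.empty(n, dtype=np.float32)
    res[0] = 0
    for i in range(1, n):
        res[i] = a * res[i - 1] + (1 - a) * (b ** (i - 1))
    return res

a = calc_func_nopython(0.988, 0.9988, 5000)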
1. Optimize the function (algebraic simplification)
# @nb.njit() is a shortcut for @nb.jit(nopython=True)
@nb.njit()
def calc_func_opt_1(a, b, n):
    res = np.empty(n, dtype=np.float64)
    fact = b
    res[0] = 0.
    res[1] = a * res[0] + (1. - a) * 1.
    res[2] = a * res[1] + (1. - a) * fact
    for i in range(3, n):
        fact *= b
        res[i] = a * res[i - 1] + (1. - a) * fact
    return res
Also a good idea is to use scalars where possible.

@nb.njit()
def calc_func_opt_2(a, b, n):
    res = np.empty(n, dtype=np.float64)
    fact_1 = b
    fact_2 = 0.
    res[0] = fact_2
    fact_2 = a * fact_2 + (1. - a) * 1.
    res[1] = fact_2
    fact_2 = a * fact_2 + (1. - a) * fact_1
    res[2] = fact_2
    for i in range(3, n):
        fact_1 *= b
        fact_2 = a * fact_2 + (1. - a) * fact_1
        res[i] = fact_2
    return res
Timings

%timeit a = calc_func(0.988, 0.9988, 5000)
222 µs ± 2.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit a = calc_func_opt_1(0.988, 0.9988, 5000)
22.7 µs ± 45.5 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit a = calc_func_opt_2(0.988, 0.9988, 5000)
15.3 µs ± 35.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
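Since simplifications like this "may change the result", it is worth comparing the variants numerically. A quick sanity check (my addition, assuming the three functions above are defined):

import numpy as np

ref  = calc_func(0.988, 0.9988, 5000)        # original, float32
opt1 = calc_func_opt_1(0.988, 0.9988, 5000)  # float64
opt2 = calc_func_opt_2(0.988, 0.9988, 5000)  # float64, scalar recurrence

# Exact equality is not expected (different dtype and operation order),
# so compare with a tolerance.
print(np.allclose(ref, opt1, rtol=1e-4))
print(np.allclose(opt1, opt2))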
2. Are signatures recommendable?

In ahead-of-time (AOT) mode signatures are necessary, but not in the usual JIT mode. The example above is not SIMD-vectorizable, so you won't see much positive or negative effect from a possibly suboptimal declaration of inputs and outputs. Let's look at another example.
# Numba is able to SIMD-vectorize this loop if
# a, b, res are contiguous arrays
@nb.njit(fastmath=True)
def some_function_1(a, b):
    res = np.empty_like(a)
    for i in range(a.shape[0]):
        res[i] = a[i]**2 + b[i]**2
    return res

@nb.njit("float64[:](float64[:],float64[:])", fastmath=True)
def some_function_2(a, b):
    res = np.empty_like(a)
    for i in range(a.shape[0]):
        res[i] = a[i]**2 + b[i]**2
    return res
a = np.random.rand(10_000)
b = np.random.rand(10_000)

# Example of non-contiguous input:
# a = np.random.rand(10_000)[0::2]
# b = np.random.rand(10_000)[0::2]
%timeit res=some_function_1(a,b)
5.59 µs ± 36.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit res=some_function_2(a,b)
9.36 µs ± 47.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Why is the version with signatures slower? Let's have a closer look at the signatures.

some_function_1.nopython_signatures
#[(array(float64, 1d, C), array(float64, 1d, C)) -> array(float64, 1d, C)]
#this is equivalent to "float64[::1](float64[::1],float64[::1])"
some_function_2.nopython_signatures
#[(array(float64, 1d, A), array(float64, 1d, A)) -> array(float64, 1d, A)]

If the memory layout is unknown at compile time, it is often impossible to SIMD-vectorize the algorithm. Of course you can explicitly declare C-contiguous arrays, but then the function won't work anymore for non-contiguous inputs, which is normally not intended.