Usage of parallel option in numba.jit decorator makes function give wrong result
Question
Given two opposite corners of a rectangle, (x1, y1) and (x2, y2), and two radii r1 and r2, find the ratio of the number of points that lie between the circles defined by the radii r1 and r2 to the total number of points in the rectangle.
Simple NumPy approach:
import numpy as np

def func_1(x1,y1,x2,y2,r1,r2,n):
    x11,y11 = np.meshgrid(np.linspace(x1,x2,n),np.linspace(y1,y2,n))
    z1 = np.sqrt(x11**2+y11**2)
    a = np.where((z1>(r1)) & (z1<(r2)))
    fill_factor = len(a[0])/(n*n)
    return fill_factor
Next I tried to optimize this function with the jit decorator from numba. When I use:
nopython = True
the function is faster and gives the right output. But when I also add:
parallel = True
the function is faster but gives the wrong result. I know that this has something to do with my z1 matrix, since it is not being updated properly.
import numpy as np
from numba import jit

@jit(nopython=True,parallel=True)
def func_2(x1,y1,x2,y2,r1,r2,n):
    x_ = np.linspace(x1,x2,n)
    y_ = np.linspace(y1,y2,n)
    z1 = np.zeros((n,n))
    for i in range(n):
        for j in range(n):
            z1[i][j] = np.sqrt((x_[i]*x_[i]+y_[j]*y_[j]))
    a = np.where((z1>(r1)) & (z1<(r2)))
    fill_factor = len(a[0])/(n*n)
    return fill_factor
Test values:
x1 = 1.0
x2 = -1.0
y1 = 1.0
y2 = -1.0
r1 = 0.5
r2 = 0.75
n = 25000
Additional info: Python version: 3.6.1, Numba version: 0.34.0+5.g1762237, NumPy version: 1.13.1
Recommended answer
The problem with parallel=True is that it's a black box. Numba doesn't even guarantee that it will actually parallelize anything. It uses heuristics to decide whether the code is parallelizable and what can be done in parallel. These can fail, and in your example they do fail, just like in my experiments with parallel and numba. That makes parallel untrustworthy and I would advise against using it!
In newer versions (0.34) prange was added, and you might have more luck with that. It can't be applied in this case because prange works like range, and that's different from np.linspace...
Just a note: you can avoid building z and calling np.where in your function completely; you could just do the checks explicitly:
import numpy as np
import numba as nb

@nb.njit  # equivalent to "jit(nopython=True)".
def func_2(x1,y1,x2,y2,r1,r2,n):
    x_ = np.linspace(x1,x2,n)
    y_ = np.linspace(y1,y2,n)
    cnts = 0
    for i in range(n):
        for j in range(n):
            z = np.sqrt(x_[i] * x_[i] + y_[j] * y_[j])
            if r1 < z < r2:
                cnts += 1
    fill_factor = cnts/(n*n)
    return fill_factor
That should also provide some speedup compared to your function, maybe even more than using parallel=True (if it worked correctly).
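To tell which variant returns the "right" result, it may help to have an independent reference value. The sketch below is not part of the original code: the helper names fill_factor_analytic and fill_factor_grid are made up for illustration. It assumes the whole ring between r1 and r2 lies inside the rectangle, which holds for the test values in the question, so the expected ratio is roughly the ring's area divided by the rectangle's area. It compares that value with a plain NumPy count on a much smaller grid (n = 1001 instead of 25000, to keep memory use modest).

```python
import math
import numpy as np

def fill_factor_analytic(x1, y1, x2, y2, r1, r2):
    # Hypothetical helper: ring area / rectangle area.
    # Only meaningful when the ring lies entirely inside the rectangle.
    rect_area = abs(x2 - x1) * abs(y2 - y1)
    return math.pi * (r2**2 - r1**2) / rect_area

def fill_factor_grid(x1, y1, x2, y2, r1, r2, n):
    # Hypothetical helper: count grid points with r1 < distance < r2.
    x = np.linspace(x1, x2, n)
    y = np.linspace(y1, y2, n)
    # Squared distances via broadcasting; comparing squares avoids the sqrt.
    d2 = x[:, None]**2 + y[None, :]**2
    inside = (d2 > r1**2) & (d2 < r2**2)
    return np.count_nonzero(inside) / (n * n)

# Test values from the question, with a smaller n.
expected = fill_factor_analytic(1.0, 1.0, -1.0, -1.0, 0.5, 0.75)
measured = fill_factor_grid(1.0, 1.0, -1.0, -1.0, 0.5, 0.75, 1001)
print(expected, measured)
```

The two numbers should agree to within a small discretization error, so a jitted version that returns something far from them for the same inputs is the one that is miscounting.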