Usage of parallel option in numba.jit decorator makes function give wrong result


Question

Given two opposite corners of a rectangle, (x1, y1) and (x2, y2), and two radii r1 and r2, find the ratio of the number of points that lie between the circles of radius r1 and r2 (centered at the origin) to the total number of points in the rectangle.

Simple NumPy approach:

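import numpy as np

def func_1(x1, y1, x2, y2, r1, r2, n):
    # Build an n-by-n grid of points covering the rectangle.
    x11, y11 = np.meshgrid(np.linspace(x1, x2, n), np.linspace(y1, y2, n))
    z1 = np.sqrt(x11**2 + y11**2)  # distance of each point from the origin
    a = np.where((z1 > r1) & (z1 < r2))  # indices of points between the circles
    fill_factor = len(a[0]) / (n * n)
    return fill_factor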

Next I tried to optimize this function with the jit decorator from Numba. When I use:

nopython = True

the function is faster and gives the right output. But when I also add:

parallel = True

the function is faster but gives the wrong result. I know that this has something to do with my z1 matrix, since it is not being updated properly.

import numpy as np
from numba import jit

@jit(nopython=True, parallel=True)
def func_2(x1, y1, x2, y2, r1, r2, n):
    x_ = np.linspace(x1, x2, n)
    y_ = np.linspace(y1, y2, n)
    z1 = np.zeros((n, n))
    # Fill z1 with the distance of every grid point from the origin.
    for i in range(n):
        for j in range(n):
            z1[i][j] = np.sqrt(x_[i] * x_[i] + y_[j] * y_[j])
    a = np.where((z1 > r1) & (z1 < r2))
    fill_factor = len(a[0]) / (n * n)
    return fill_factor

Test values:

x1 = 1.0
x2 = -1.0
y1 = 1.0
y2 = -1.0
r1 = 0.5
r2 = 0.75
n = 25000
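
For these test values, a reference value can be worked out independently of Numba, which helps in deciding which variant is the wrong one. The sketch below is not part of the original post: it assumes the annulus lies entirely inside the rectangle (true here, since r2 = 0.75 and the rectangle spans -1 to 1), and the helper names analytic_fill_factor and grid_fill_factor are made up for illustration. It compares the area ratio with a slow pure-Python grid count on a much smaller n.

```python
import math

def analytic_fill_factor(x1, y1, x2, y2, r1, r2):
    """Annulus area over rectangle area.

    Only valid under the assumption that the annulus lies fully
    inside the rectangle.
    """
    rect_area = abs(x2 - x1) * abs(y2 - y1)
    return math.pi * (r2 * r2 - r1 * r1) / rect_area

def grid_fill_factor(x1, y1, x2, y2, r1, r2, n):
    """Pure-Python grid count using the same point layout as np.linspace
    (evenly spaced, both endpoints included)."""
    xs = [x1 + (x2 - x1) * i / (n - 1) for i in range(n)]
    ys = [y1 + (y2 - y1) * j / (n - 1) for j in range(n)]
    count = 0
    for x in xs:
        for y in ys:
            if r1 < math.sqrt(x * x + y * y) < r2:
                count += 1
    return count / (n * n)

expected = analytic_fill_factor(1.0, 1.0, -1.0, -1.0, 0.5, 0.75)
approx = grid_fill_factor(1.0, 1.0, -1.0, -1.0, 0.5, 0.75, 400)
print(expected, approx)
```

The two numbers should agree to within about a percent at n = 400, and a correct implementation at n = 25000 should land closer still to the analytic value of roughly 0.245. A result far from that points to the implementation rather than to the grid resolution.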

Additional info: Python version 3.6.1, Numba version 0.34.0+5.g1762237, NumPy version 1.13.1

Recommended answer

The problem with parallel=True is that it's a black box. Numba doesn't even guarantee that it will actually parallelize anything. It uses heuristics to decide whether the code is parallelizable and what can be done in parallel. These can fail, and in your example they do fail, just as they did in my own experiments with parallel and Numba. That makes parallel untrustworthy, and I would advise against using it!

In newer versions (0.34) prange was added, and you could have more luck with that. It can't be applied in this case because prange works like range, and that's different from np.linspace...

Just a note: you can avoid building z and calling np.where in your function entirely by doing the checks explicitly:

import numpy as np
import numba as nb

@nb.njit   # equivalent to "jit(nopython=True)".
def func_2(x1,y1,x2,y2,r1,r2,n):
    x_ = np.linspace(x1,x2,n)
    y_ = np.linspace(y1,y2,n)
    cnts = 0
    for i in range(n):
        for j in range(n):
            z = np.sqrt(x_[i] * x_[i] + y_[j] * y_[j])
            if r1 < z < r2:
                cnts += 1
    fill_factor = cnts/(n*n)
    return fill_factor

That should also provide some speedup compared to your function, maybe even more than parallel=True would (if it worked correctly).
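
A rough back-of-the-envelope estimate suggests why skipping the intermediate array matters at n = 25000. The sketch below is an illustration, not a measurement: the helper name intermediate_bytes is hypothetical, it assumes float64 for z1 (the np.zeros default), one byte per boolean mask element, 64-bit index arrays from np.where (platform dependent), and that roughly a quarter of the points fall between the circles (the area ratio pi * (r2**2 - r1**2) / 4 is about 0.245 for the question's test values).

```python
def intermediate_bytes(n, hit_fraction):
    """Rough memory estimate for the array-based version.

    All sizes are assumptions stated in the comments, not measured values.
    """
    z_bytes = n * n * 8             # z1: float64, the np.zeros default dtype
    mask_bytes = n * n * 1          # one boolean mask, one byte per element
    hits = int(n * n * hit_fraction)
    where_bytes = 2 * hits * 8      # np.where on a 2-D mask: two index arrays,
                                    # assumed to hold 64-bit integers
    return z_bytes, mask_bytes, where_bytes

z_b, m_b, w_b = intermediate_bytes(25000, 0.245)
print(z_b / 1e9, m_b / 1e9, w_b / 1e9)  # sizes in GB
```

Under these assumptions z1 alone takes 5 GB, before counting the masks and the index arrays, whereas the explicit loop only needs the two length-n coordinate arrays and a counter.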
