Bounds checking is not supported for CUDA


Question

I am trying to use Numba to run my code on the GPU and speed it up, but I get the following error:

in jit raise NotImplementedError("bounds checking is not supported for CUDA")
NotImplementedError: bounds checking is not supported for CUDA

I saw that a similar question had been raised before, but it was neither fully specified nor answered. I first tried the vectorized expression (y = corr*x + np.sqrt(1.-corr**2)*z); when that failed with the same error, I rewrote it as two nested for loops. I also tried changing the boundscheck option, but that did not change the outcome. The error does not appear when I omit target, presumably because the function then runs on the CPU.

import numpy as np
from numba import jit

N = int(1e8)
@jit(nopython=True, target='cuda', boundscheck=False)
def Brownian_motions(T, N, corr):
    x = np.random.normal(0, 1, size=(T,N))
    z = np.random.normal(0, 1, size=(T,N))
    y = np.zeros(shape=(T,N))
    for i in range(T):
        for j in range(N):
            y[i,j] = corr*x[i,j] + np.sqrt(1.-corr**2)*z[i,j]
    return(x,y)

x, y = Brownian_motions(T = 500, N = N, corr = -0.45)

Could you please help me? I am using Python 3.7.6 and Numba 0.48.0.
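For reference, the formula in the question builds y so that, for independent standard normal x and z, the correlation between x and y equals corr. Below is a minimal NumPy-only sketch of that check, run on the CPU with a much smaller N than in the question. The function name `correlated_normals` and the fixed seed are assumptions made for illustration and are not part of the original code.

```python
import numpy as np

def correlated_normals(T, N, corr, seed=0):
    """Return (x, y) with y = corr*x + sqrt(1 - corr**2)*z, vectorized on the CPU."""
    rng = np.random.default_rng(seed)  # seeded only to make the sketch reproducible
    x = rng.standard_normal((T, N))
    z = rng.standard_normal((T, N))
    y = corr * x + np.sqrt(1.0 - corr**2) * z
    return x, y

x, y = correlated_normals(T=5, N=200_000, corr=-0.45)
# Sample correlation between x and y over all entries; it should be close to corr
rho = np.corrcoef(x.ravel(), y.ravel())[0, 1]
print(round(rho, 2))
```

The printed sample correlation should land close to -0.45; it is an estimate, so the last digits depend on the draws.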

Recommended answer

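In my case I replaced the decorator with plain @jit, without target="cuda". Note that Numba's @jit compiles the function to machine code through LLVM (XLA is the compiler behind JAX's jit, not Numba's), and without a CUDA target the compiled function runs on the CPU. Here is example code comparing a plain Python loop with the @jit-compiled one; the "without GPU" and "with GPU" labels are kept from the original answer, although neither function runs on the GPU.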

from numba import jit
import numpy as np
# to measure execution time
from timeit import default_timer as timer

# plain Python function, executed by the interpreter on the CPU
def func(a):
    for i in range(10000000):
        a[i] += 1

# same loop compiled by Numba; with target="cuda" removed it runs on the CPU, not the GPU
@jit
def func2(a):
    for i in range(10000000):
        a[i] += 1

if __name__ == "__main__":
    n = 10000000
    a = np.ones(n, dtype=np.float64)

    start = timer()
    func(a)
    print("without GPU:", timer() - start)

    # this timing includes Numba's one-time compilation of func2
    start = timer()
    func2(a)
    # label kept from the original answer; func2 is CPU code compiled by Numba
    print("with GPU:", timer() - start)

Result:
without GPU: 5.353004818000045
with GPU: 0.23115529000006063
