Getting wrong results when calculating on the GPU (python3.5 + numba + CUDA8.0)

Problem description

I want to get the sums of different parts of an array. When I run my code, the printed output shows two problems.

Problem 1:

Described in detail here. It has been solved, and it may not have been a real problem.

Problem 2:

In my code, I assigned different values to sbuf[0,2], sbuf[1,2], sbuf[2,2] and to sbuf[0,3], sbuf[1,3], sbuf[2,3].

But after cuda.syncthreads(), the values became identical between sbuf[0,2] and sbuf[0,3], between sbuf[1,2] and sbuf[1,3], and between sbuf[2,2] and sbuf[2,3].

This directly makes the values of Xi_s, Xi1_s and Yi_s wrong.

These are my guesses, based on what was printed inside the kernel.

@talonmies said that relying on print statements inside kernels like this is dangerous.

So I want to know whether there is a useful way to debug my code other than printing statements inside kernels.

    ...

@cuda.jit
def calcu_T(D, T):
  ...

                    if bx==1 and tx==1:
                        print('5,c_x,c_y,L,c_index,bx,tx,ty,sbuf[0,ty],sbuf[1,ty],sbuf[2,ty],',c_x,',',c_y,',',L,',',c_index,',',bx,',',tx,',',ty,',',sbuf[0,ty],',',sbuf[1,ty],',',sbuf[2,ty])

                    cuda.syncthreads()

                    if bx==1 and tx==1:
                        print('1,c_x,c_y,L,c_index,bx,tx,ty,sbuf[0,ty],sbuf[1,ty],sbuf[2,ty],',c_x,',',c_y,',',L,',',c_index,',',bx,',',tx,',',ty,',',sbuf[0,ty],',',sbuf[1,ty],',',sbuf[2,ty])

                     ...

Recommended answer

As @talonmies said, printing statements inside kernels is not a good choice for debugging. If someone has the same problem, this documentation is helpful. Beyond that, you should learn pdb, especially the debugger commands such as 'p' (print an expression) and 'c' (continue execution).
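To make the pdb advice concrete, here is a minimal sketch of the 'p' and 'c' commands in action. It rests on two assumptions: that the documentation mentioned above is Numba's CUDA simulator guide (the simulator is enabled by setting the environment variable NUMBA_ENABLE_CUDASIM=1 before importing numba, after which a kernel runs as ordinary Python and pdb.set_trace() can be placed inside it), and that a plain Python function can stand in for the kernel body. The function partial_sum and its arguments are hypothetical and are not taken from the original calcu_T kernel. The commands are fed from a string rather than typed, so the session is reproducible.

```python
import io
import pdb


def partial_sum(values, start, stop):
    """Hypothetical stand-in for a kernel body: sums values[start:stop]."""
    total = 0
    for i in range(start, stop):
        total += values[i]
    return total


def scripted_session():
    """Drive pdb with a fixed command list instead of a keyboard."""
    # 'p <expr>' prints an expression; 'c' continues until the call returns.
    commands = io.StringIO("p values\np (start, stop)\nc\n")
    transcript = io.StringIO()
    debugger = pdb.Pdb(stdin=commands, stdout=transcript,
                       nosigint=True, readrc=False)
    # runcall pauses on the first line of partial_sum and reads the commands.
    result = debugger.runcall(partial_sum, [1.0, 2.0, 3.0, 4.0], 1, 3)
    return result, transcript.getvalue()


if __name__ == "__main__":
    result, transcript = scripted_session()
    print(transcript)
    print("result:", result)
```

In an interactive session the same commands would be typed at the (Pdb) prompt. Under the simulator, a guard such as the question's `if bx==1 and tx==1:` could wrap a pdb.set_trace() call so that only one thread stops in the debugger.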
