[CUDA] Problem accessing passed kernel variable


Problem description






The second argument passed to the CUDA kernel function (here B, a float *) contains garbage values. Any suggestions?

Kernel function:

__global__ void fmultiply(float *A, float *B, float *C)
{
      int idx = blockIdx.x*blockDim.x + threadIdx.x;
      // B[idx] gives a garbage value here
      C[idx] = A[idx]*B[idx];
}







Main File:


int N = 10; //Array Containing Maximum of 10 elements
size_t size = N*sizeof(float);
...
cudaMalloc((void **)&a_d, size);
cudaMalloc((void **)&b_d, size);
cudaMalloc((void **)&c_d, size);

...

cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);
cudaMemcpy(b_d, b_h, size, cudaMemcpyHostToDevice);

int threadsPerBlock = 256;
int noOfBlocks = (N/threadsPerBlock);

//Calling Kernel Function
fmultiply<<<threadsPerBlock, noOfBlocks>>>(a_d, b_d, c_d);

cudaMemcpy(c_d, c_h, size, cudaMemcpyDeviceToHost);
......

cudaFree(a_d);
cudaFree(b_d);
cudaFree(c_d);

Recommended answer

Please check the contents of b_d before the call to fmultiply(..) as well :)

You could also guard the processing of the function with a critical section, but B is not modified anywhere in there...

...so you will probably need to look for other "changers" in the kernel... :)
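
One way to follow the suggestion of inspecting b_d before the launch is to copy it back into a scratch host buffer and compare it with b_h. The sketch below is only an illustration: the helper names check and countMismatches and the sample data are assumptions, not part of the original post.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a readable message if a runtime call failed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Hypothetical helper: copy n floats back from the device and count how many
// differ from the host array they were uploaded from.
static int countMismatches(const float *dev, const float *host, int n)
{
    float *tmp = (float *)malloc(n * sizeof(float));
    if (tmp == NULL) {
        fprintf(stderr, "malloc failed\n");
        exit(EXIT_FAILURE);
    }
    check(cudaMemcpy(tmp, dev, n * sizeof(float), cudaMemcpyDeviceToHost),
          "cudaMemcpy (device to host)");

    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (tmp[i] != host[i]) {
            printf("index %d: device %f, host %f\n", i, tmp[i], host[i]);
            ++mismatches;
        }
    }
    free(tmp);
    return mismatches;
}

int main()
{
    const int N = 10;
    const size_t size = N * sizeof(float);

    float b_h[N];
    for (int i = 0; i < N; ++i)
        b_h[i] = 0.5f * i;              // sample data, chosen for illustration

    float *b_d = NULL;
    check(cudaMalloc((void **)&b_d, size), "cudaMalloc");
    check(cudaMemcpy(b_d, b_h, size, cudaMemcpyHostToDevice),
          "cudaMemcpy (host to device)");

    // Inspect b_d at the point where the kernel launch would follow.
    printf("mismatches before launch: %d\n", countMismatches(b_d, b_h, N));

    check(cudaFree(b_d), "cudaFree");
    return 0;
}
```

If the count is zero at this point, the upload itself is fine and the problem more likely lies in the launch or in how the result is copied back.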


Yes, I tried this:
__global__ void fmultiply(float *A, float *B, float *C)
{
      int idx = blockIdx.x*blockDim.x + threadIdx.x;

      if (idx < N)
          C[idx] = A[idx]*B[idx];
}






But still no luck.

Is this the correct way to access variables inside a kernel?
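
Indexing the pointers with blockIdx.x*blockDim.x + threadIdx.x is the usual pattern, so the access itself looks fine. Judging only from the excerpt, the host side is the more likely cause, for four reasons:

- The launch configuration is written as <<<blocks, threads per block>>>, but the call in the question passes the thread count first.
- N/threadsPerBlock is integer division, so 10/256 gives 0 blocks.
- A host variable such as N is not visible inside device code unless it is passed in.
- cudaMemcpy takes the destination first, so the final copy in the question does not fill c_h.

A minimal sketch with those points addressed follows. The extra parameter n, the helper blocksFor and the sample data are assumptions made for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// n is passed by value because host variables are not visible in device code.
__global__ void fmultiply(const float *A, const float *B, float *C, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)                         // threads past the end do nothing
        C[idx] = A[idx] * B[idx];
}

// Hypothetical helper: number of blocks needed to cover n elements, rounded up.
static int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}

int main()
{
    const int N = 10;
    const size_t size = N * sizeof(float);

    float a_h[N], b_h[N], c_h[N];
    for (int i = 0; i < N; ++i) {        // sample data, chosen for illustration
        a_h[i] = (float)i;
        b_h[i] = 2.0f;
    }

    float *a_d = NULL, *b_d = NULL, *c_d = NULL;
    cudaMalloc((void **)&a_d, size);
    cudaMalloc((void **)&b_d, size);
    cudaMalloc((void **)&c_d, size);

    cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);
    cudaMemcpy(b_d, b_h, size, cudaMemcpyHostToDevice);

    const int threadsPerBlock = 256;
    const int noOfBlocks = blocksFor(N, threadsPerBlock);   // rounds up

    // Grid size first, block size second.
    fmultiply<<<noOfBlocks, threadsPerBlock>>>(a_d, b_d, c_d, N);

    // A bad launch configuration is reported here, not by the launch itself.
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Destination (host) first, source (device) second.
    err = cudaMemcpy(c_h, c_d, size, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy back failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < N; ++i)
        printf("c_h[%d] = %f\n", i, c_h[i]);

    cudaFree(a_d);
    cudaFree(b_d);
    cudaFree(c_d);
    return 0;
}
```

Checking cudaGetLastError() right after the launch is worth keeping even once the code works, since a kernel that never ran otherwise fails silently and leaves the output buffer uninitialized.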

