[CUDA] Problem accessing passed kernel variable
Problem description
The second variable passed to the CUDA kernel function (here B, a float *) gives garbage values. Need suggestions.
Kernel function:
__global__ void fmultiply(float *A, float *B, float *C)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    // B[idx] gives a garbage value here
    C[idx] = A[idx]*B[idx];
}
Main File:
int N = 10; //Array Containing Maximum of 10 elements
size_t size = N*sizeof(float);
...
cudaMalloc((void**)&a_d, size);
cudaMalloc((void**)&b_d, size);
cudaMalloc((void**)&c_d, size);
...
cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);
cudaMemcpy(b_d, b_h, size, cudaMemcpyHostToDevice);
int threadsPerBlock = 256;
int noOfBlocks = (N/threadsPerBlock);
//Calling Kernel Function
fmultiply<<<threadsPerBlock, noOfBlocks>>>(a_d, b_d, c_d);
cudaMemcpy(c_d, c_h, size, cudaMemcpyDeviceToHost);
......
cudaFree(a_d);
cudaFree(b_d);
cudaFree(c_d);
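A quick check of the launch arithmetic in the listing above may be useful. With N = 10 and threadsPerBlock = 256, the integer division N/threadsPerBlock evaluates to 0, and the `<<<...>>>` launch syntax expects the grid size (number of blocks) first and the block size (threads per block) second. Below is a minimal sketch of the usual round-up calculation; `blocksFor` is a hypothetical helper name and not part of the original post.

```cuda
// Hypothetical helper (not from the original post): number of blocks needed
// to cover n elements, rounding up so a partial last block is still launched.
// Assumes n >= 0 and threadsPerBlock > 0.
inline int blocksFor(int n, int threadsPerBlock)
{
    return (n + threadsPerBlock - 1) / threadsPerBlock;
}
```

For the values in the post, `10 / 256` is 0 while `blocksFor(10, 256)` is 1, so a launch could be written as `fmultiply<<<blocksFor(N, threadsPerBlock), threadsPerBlock>>>(...)`. Because that single block still has 256 threads, the kernel would need to skip indices at or beyond N.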
Recommended answer
Please observe the content of b_d before the call of fmultiply(..) too :)
You could also lock the processing of the function by a critical section, but there is no modification of B there...
...probably you will need to find other "changers" in the kernel... :)
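One way to follow the suggestion of observing b_d before the kernel call is to copy the device buffer back into a scratch host array and compare it with the source, checking the return code of each runtime call. The sketch below is only an assumption about how such a check could look; `deviceMatchesHost` is a hypothetical helper, not something from the original post.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical debugging helper: copies n floats from the device pointer d_ptr
// back to the host and compares them with the host array h_ref.
// Returns true only if the copy succeeds and every element matches exactly
// (the round trip is a plain byte copy, so exact comparison is intended;
// NaN values would compare as mismatches).
bool deviceMatchesHost(const float* d_ptr, const float* h_ref, int n)
{
    std::vector<float> scratch(n);
    cudaError_t err = cudaMemcpy(scratch.data(), d_ptr, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaMemcpy failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    for (int i = 0; i < n; ++i) {
        if (scratch[i] != h_ref[i]) {
            std::fprintf(stderr, "mismatch at %d: device %f, host %f\n",
                         i, scratch[i], h_ref[i]);
            return false;
        }
    }
    return true;
}
```

Calling `deviceMatchesHost(b_d, b_h, N)` right after the host-to-device copy would show whether B is already wrong before the kernel runs. After the launch, `cudaGetLastError()` reports configuration errors such as an invalid grid or block size, which otherwise go unnoticed.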
Yes, I tried this:
__global__ void fmultiply(float *A, float *B, float *C)
{
    int idx = blockIdx.x*blockDim.x + threadIdx.x;
    if(idx<N)
        C[idx] = A[idx]*B[idx];
}
But still no luck.
Is this the correct way to access variables inside the kernel?
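The indexing expression in the kernel is the usual one, so the element access itself looks fine. What the guarded version still needs is a way to see N, which is a local variable of the host code and is not visible inside a `__global__` function unless it is passed in (or defined as a macro or a `__constant__` variable). The sketch below is one possible corrected version under stated assumptions, not a fix confirmed in the thread: it passes the length as a fourth argument `n`, declares the kernel `void`, launches with blocks first and threads second, and copies the result from the device buffer into the host buffer. The wrapper name `multiplyOnDevice` is hypothetical.

```cuda
#include <cuda_runtime.h>

// Element-wise product C = A * B. The length n is passed in because host
// variables such as N in the main file are not visible to device code.
__global__ void fmultiply(const float* A, const float* B, float* C, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)  // threads past the end of the arrays do nothing
        C[idx] = A[idx] * B[idx];
}

// Hypothetical host wrapper (not from the original post). Assumes n > 0.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t multiplyOnDevice(const float* a_h, const float* b_h, float* c_h, int n)
{
    size_t size = n * sizeof(float);
    float *a_d = nullptr, *b_d = nullptr, *c_d = nullptr;

    cudaError_t err = cudaMalloc((void**)&a_d, size);
    if (err == cudaSuccess) err = cudaMalloc((void**)&b_d, size);
    if (err == cudaSuccess) err = cudaMalloc((void**)&c_d, size);
    if (err == cudaSuccess) err = cudaMemcpy(a_d, a_h, size, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(b_d, b_h, size, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        int threadsPerBlock = 256;
        int noOfBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;  // round up
        fmultiply<<<noOfBlocks, threadsPerBlock>>>(a_d, b_d, c_d, n);  // grid, then block
        err = cudaGetLastError();  // reports launch configuration errors
    }

    // Destination (host) first, source (device) second.
    if (err == cudaSuccess) err = cudaMemcpy(c_h, c_d, size, cudaMemcpyDeviceToHost);

    cudaFree(a_d);  // cudaFree on a null pointer performs no operation
    cudaFree(b_d);
    cudaFree(c_d);
    return err;
}
```

With the ten-element arrays from the post, a call such as `multiplyOnDevice(a_h, b_h, c_h, N)` would launch one block of 256 threads, of which only the first ten write to C.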