cudaMemcpy2D setting values to 0


Problem description

I'm attempting to copy a 2-dimensional array from host to device with cudaMallocPitch and cudaMemcpy2D, but the copy seems to be setting my values to 0.

I'll write the basics of my code in the browser. I know the value printed from the kernel should not be 0. Any ideas?

__global__ void kernel(float **d_array) {
    printf("%f", d_array[0][0]);
}

void kernelWrapper(int rows, int cols, float **array) {
    float **d_array;
    size_t pitch;
    cudaMallocPitch((void**) &d_array, &pitch, rows*sizeof(float), cols);
    cudaMemcpy2D(d_array, pitch, array, rows*sizeof(float), rows*sizeof(float), cols, cudaMemcpyHostToDevice);
    kernel<<<1,1>>>(d_array);
}

For some reason, the kernel keeps printing 0.0000. I know that the first element is not 0, because I tested this by printing the first element of the host array. What is happening?

Edit: I tried this code as well, but got invalid pointer errors.

cudaMalloc(d_array, rows*sizeof(float*));
for (int i = 0; i < rows; i++) {
    cudaMalloc((void**) &d_array[i], cols*sizeof(float));
}
cudaMemcpy(d_array, array, rows*sizeof(float*), cudaMemcpyHostToDevice);


Recommended answer

Despite its name, cudaMemcpy2D (https://docs.nvidia.com/cuda/cuda-runtime-api/index.html#group__CUDART__MEMORY_1g17f3a55e8c9aef5f90b67cdf22851375) does not copy a doubly-subscripted C host array (**) to a doubly-subscripted (**) device array. You'll note that it expects single pointers (*) to be passed to it, not double pointers (**). cudaMemcpy2D is used for copying a flat, strided array, not an array of row pointers. There are 2 dimensions inherent in the concept of strided access (the width of each row and the number of rows), which is where the name comes from.

In general, copying a doubly-subscripted 2D array from host to device takes more than a single API call. You are advised to flatten your array so that you can reference it with a single pointer (*); then the API calls will work. There are plenty of examples of proper usage of cudaMemcpy2D on SO, just search for them.
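The flattening advice could look roughly like the sketch below. It assumes the host data is stored row-major in one contiguous buffer of rows*cols floats; the names readElement and fetchFromDevice are made up for illustration and are not part of the original question. Note that with this layout the width passed to cudaMallocPitch and cudaMemcpy2D is the row length in bytes (cols*sizeof(float)) and the height is the number of rows. Error checking is left out to keep the sketch short.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Reads element (row, col) of a pitched allocation into *d_out.
// The pitch is in bytes, so the row offset is applied to a char pointer.
__global__ void readElement(const float *d_array, size_t pitch,
                            int row, int col, float *d_out) {
    const float *rowPtr = (const float *)((const char *)d_array + row * pitch);
    *d_out = rowPtr[col];
}

// Hypothetical helper: copies a flat, row-major host array to the device
// and returns element (row, col) as the kernel sees it.
float fetchFromDevice(const std::vector<float> &h_array,
                      int rows, int cols, int row, int col) {
    float *d_array = nullptr;  // single pointer, not float**
    float *d_out = nullptr;
    size_t pitch = 0;

    // Width is one row in bytes, height is the number of rows.
    cudaMallocPitch((void **)&d_array, &pitch, cols * sizeof(float), rows);
    cudaMalloc((void **)&d_out, sizeof(float));

    // Host rows are tightly packed, so the source pitch equals the row width.
    cudaMemcpy2D(d_array, pitch,
                 h_array.data(), cols * sizeof(float),
                 cols * sizeof(float), rows,
                 cudaMemcpyHostToDevice);

    readElement<<<1, 1>>>(d_array, pitch, row, col, d_out);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    cudaFree(d_array);
    return result;
}
```

Calling fetchFromDevice(h, rows, cols, 0, 0) on a flat host vector h should then return h[0] rather than 0, since the kernel receives a plain float* plus the pitch instead of a float**.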

Also, whenever you are having difficulty with CUDA code, you should do CUDA error checking on all CUDA API calls and kernel launches.
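One common way to do this is to wrap every runtime call in a small macro, as in the sketch below. The macro name CUDA_CHECK and the fill / fillAndReadFirst functions are hypothetical names chosen for this example; only the runtime calls themselves (cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString and so on) come from the CUDA runtime API.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical macro: reports the failing call's file and line, then exits.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error \"%s\" at %s:%d\n",           \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Writes the same value into every element of d_data.
__global__ void fill(float *d_data, int n, float value) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d_data[i] = value;
}

// Hypothetical helper that checks every API call and the kernel launch.
float fillAndReadFirst(int n, float value) {
    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc((void **)&d_data, n * sizeof(float)));

    fill<<<(n + 255) / 256, 256>>>(d_data, n, value);
    CUDA_CHECK(cudaGetLastError());       // catches launch failures
    CUDA_CHECK(cudaDeviceSynchronize());  // catches failures during execution

    float first = 0.0f;
    CUDA_CHECK(cudaMemcpy(&first, d_data, sizeof(float),
                          cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_data));
    return first;
}
```

With checks like these in place, passing a bad pointer to a copy or dereferencing an invalid pointer in a kernel is reported as an error instead of silently leaving the output at 0.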

If you really want to copy a doubly-subscripted 2D array directly, take a look at this question/answer for a worked example. It's not trivial.
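For orientation, one possible shape of such a direct copy is sketched below. It is not the linked worked example, only an illustration under the assumption that the host array is a float** with one separately allocated row per pointer; readElement2D and fetchViaPointerTable are hypothetical names. The key point is that the device row pointers are first collected in host memory, because host code cannot write through d_array[i] when d_array points to device memory. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Reads d_array[row][col] through a device-side table of row pointers.
__global__ void readElement2D(float **d_array, int row, int col, float *d_out) {
    *d_out = d_array[row][col];
}

// Hypothetical helper: deep-copies a float** host array to the device
// and returns element (row, col) as the kernel sees it.
float fetchViaPointerTable(float **h_array, int rows, int cols,
                           int row, int col) {
    // Host-side table holding the device address of each row.
    std::vector<float *> h_rowPtrs(rows, nullptr);
    for (int i = 0; i < rows; i++) {
        cudaMalloc((void **)&h_rowPtrs[i], cols * sizeof(float));
        cudaMemcpy(h_rowPtrs[i], h_array[i], cols * sizeof(float),
                   cudaMemcpyHostToDevice);
    }

    // Copy the table of device row pointers itself to the device.
    float **d_array = nullptr;
    cudaMalloc((void **)&d_array, rows * sizeof(float *));
    cudaMemcpy(d_array, h_rowPtrs.data(), rows * sizeof(float *),
               cudaMemcpyHostToDevice);

    float *d_out = nullptr;
    cudaMalloc((void **)&d_out, sizeof(float));
    readElement2D<<<1, 1>>>(d_array, row, col, d_out);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);

    // Free each row, then the pointer table and the output slot.
    for (int i = 0; i < rows; i++) cudaFree(h_rowPtrs[i]);
    cudaFree(d_array);
    cudaFree(d_out);
    return result;
}
```

This needs one allocation and one copy per row plus an extra copy for the pointer table, which is why the flattened layout is usually the simpler choice.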
