Allocate 2D Array on Device Memory in CUDA

Question

How do I allocate a 2D array in device memory in CUDA and transfer it to and from the host?

Recommended answer

I found a solution to this problem. I did not have to flatten the array myself.

The built-in cudaMallocPitch() function did the job, and cudaMemcpy2D() let me transfer the array to and from the device.
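
The original answer does not show the cudaMemcpy2D() call itself, so here is a minimal sketch of a host-to-device-and-back round trip. The helper name roundTrip2D and the 4x3 array size are assumptions made for illustration. The point to note is that the tightly packed host array and the pitched device buffer have different row strides, and each is passed as its own pitch argument.

```cuda
#include <cstring>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original answer): copies a small 2D host
// array to a pitched device allocation and back, then reports whether the
// data survived the round trip unchanged.
bool roundTrip2D()
{
    const int rows = 4, cols = 3;                  // illustrative sizes
    const size_t rowBytes = cols * sizeof(float);  // bytes of real data per row
    float hostIn[rows][cols], hostOut[rows][cols];
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            hostIn[r][c] = 10.0f * r + c;

    float* devPtr = nullptr;
    size_t pitch = 0;  // device row stride in bytes, filled in by cudaMallocPitch
    if (cudaMallocPitch((void**)&devPtr, &pitch, rowBytes, rows) != cudaSuccess)
        return false;

    // Host to device: host rows are tightly packed, so the source pitch is rowBytes.
    cudaError_t err = cudaMemcpy2D(devPtr, pitch, hostIn, rowBytes,
                                   rowBytes, rows, cudaMemcpyHostToDevice);
    // Device to host: now the pitched device buffer is the source.
    if (err == cudaSuccess)
        err = cudaMemcpy2D(hostOut, rowBytes, devPtr, pitch,
                           rowBytes, rows, cudaMemcpyDeviceToHost);
    cudaFree(devPtr);

    return err == cudaSuccess && std::memcmp(hostIn, hostOut, sizeof(hostIn)) == 0;
}
```

Calling roundTrip2D() from main() should return true on a machine with a working CUDA device. The width argument of cudaMemcpy2D() is given in bytes, not in elements.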

For example:

cudaMallocPitch((void**) &array, &pitch, a*sizeof(float), b);

This allocates a 2D array of b rows, each a floats wide. The pitch (the stride of one row in bytes, including any padding) is chosen by the runtime and returned through the pitch parameter.
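
Because the pitch is a byte count that may be larger than a*sizeof(float), element addresses have to be computed in bytes rather than as row*a + col. The sketch below illustrates that arithmetic. Both helper names (elementAt and queryPitch) are hypothetical and are not part of the CUDA API.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: address of element (row, col) in a pitched array of
// floats. The row offset is applied in bytes, the column offset in floats.
__host__ __device__ inline float* elementAt(float* base, size_t pitch, int row, int col)
{
    char* rowStart = (char*)base + (size_t)row * pitch;
    return (float*)rowStart + col;
}

// Hypothetical helper: allocates b rows of a floats, returns the pitch the
// runtime chose (in bytes) and frees the allocation again. Returns 0 on failure.
size_t queryPitch(size_t a, size_t b)
{
    float* array = nullptr;
    size_t pitch = 0;
    if (cudaMallocPitch((void**)&array, &pitch, a * sizeof(float), b) != cudaSuccess)
        return 0;
    cudaFree(array);
    return pitch;
}
```

The returned pitch is at least a*sizeof(float). The exact value depends on the device's alignment requirements, so it should always be read back from the call and never assumed.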

The following code creates a 2D array and loops over its elements. It compiles as is, so you can use it directly.

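#include <stdio.h>
#include <cuda_runtime.h>

#define height 50
#define width 50

// Device code: walk every element of the pitched allocation.
__global__ void kernel(float* devPtr, size_t pitch)
{
    for (int r = 0; r < height; ++r) {
        // Rows are pitch bytes apart, so step through the buffer in bytes.
        float* row = (float*)((char*)devPtr + r * pitch);
        for (int c = 0; c < width; ++c) {
            float element = row[c];
            row[c] = element + 1.0f;  // use the value so the read is not optimized away
        }
    }
}

// Host code
int main()
{
    float* devPtr;
    size_t pitch;

    // Allocate height rows of width floats; pitch receives the row stride in bytes.
    cudaError_t err = cudaMallocPitch((void**)&devPtr, &pitch, width * sizeof(float), height);
    if (err != cudaSuccess) {
        printf("cudaMallocPitch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // Zero the array so the kernel does not read uninitialized memory.
    cudaMemset2D(devPtr, pitch, 0, width * sizeof(float), height);

    // The kernel loops over the whole array itself, so a single thread is enough.
    kernel<<<1, 1>>>(devPtr, pitch);
    cudaDeviceSynchronize();

    cudaFree(devPtr);
    return 0;
}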
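
The kernel above visits every element inside a single loop. A common alternative, sketched here under assumptions, is to launch a 2D grid so that each thread handles exactly one element. The names scaleKernel and launchScale, the 16x16 block size, and the scaling operation are all illustrative choices, not part of the original answer.

```cuda
#include <cuda_runtime.h>

// One thread per element. Threads that fall outside the array do nothing.
__global__ void scaleKernel(float* devPtr, size_t pitch, int cols, int rows, float factor)
{
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    if (r >= rows || c >= cols) return;

    // Same pitched addressing as before: row offset in bytes, column in floats.
    float* row = (float*)((char*)devPtr + r * pitch);
    row[c] *= factor;
}

// Hypothetical host wrapper: picks a grid that covers the array, launches the
// kernel, and returns the first error encountered (cudaSuccess if none).
cudaError_t launchScale(float* devPtr, size_t pitch, int cols, int rows, float factor)
{
    dim3 block(16, 16);  // assumed block shape, tune for the target device
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    scaleKernel<<<grid, block>>>(devPtr, pitch, cols, rows, factor);
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}
```

launchScale() expects a pointer and pitch obtained from cudaMallocPitch(). The bounds check matters because the grid is rounded up to whole blocks and usually contains more threads than the array has elements.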
