Use of cudaMalloc(). Why the double pointer?
Question
I am currently going through the tutorial examples at http://code.google.com/p/stanford-cs193g-sp2010/ to learn CUDA. The code that demonstrates __global__ functions is given below. It simply creates two arrays, one on the CPU and one on the GPU, fills the GPU array with the number 7, and copies the GPU array data into the CPU array.
#include <stdlib.h>
#include <stdio.h>

__global__ void kernel(int *array)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    array[index] = 7;
}

int main(void)
{
    int num_elements = 256;
    int num_bytes = num_elements * sizeof(int);

    // pointers to host & device arrays
    int *device_array = 0;
    int *host_array = 0;

    // malloc a host array
    host_array = (int*)malloc(num_bytes);

    // cudaMalloc a device array
    cudaMalloc((void**)&device_array, num_bytes);

    int block_size = 128;
    int grid_size = num_elements / block_size;

    kernel<<<grid_size,block_size>>>(device_array);

    // download and inspect the result on the host:
    cudaMemcpy(host_array, device_array, num_bytes, cudaMemcpyDeviceToHost);

    // print out the result element by element
    for(int i=0; i < num_elements; ++i)
    {
        printf("%d ", host_array[i]);
    }

    // deallocate memory
    free(host_array);
    cudaFree(device_array);
}
My question is: why is the cudaMalloc((void**)&device_array, num_bytes); statement written with a double pointer? Even the documented definition of cudaMalloc() says the first argument is a double pointer.
Why not simply return a pointer to the beginning of the allocated memory on the GPU, just like the malloc function does on the CPU?
Recommended answer
All CUDA API functions return an error code (or cudaSuccess if no error occurred). Because the return value is taken by the status, every other result has to come back through a parameter, in effect passed by reference. Plain C has no references, so you pass the address of the variable in which you want the result stored. Since the result here is itself a pointer, you need to pass a pointer to that pointer, i.e. a double pointer.
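The same pattern can be sketched in plain C without a GPU. This is only an illustration under stated assumptions: my_error_t, my_alloc, and demo are hypothetical names invented here to mimic the shape of cudaMalloc, and the sketch uses ordinary malloc underneath. It is not how the CUDA runtime is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes, standing in for cudaError_t. */
typedef enum {
    MY_SUCCESS = 0,
    MY_ERROR_INVALID_VALUE = 1,
    MY_ERROR_OUT_OF_MEMORY = 2
} my_error_t;

/* Hypothetical allocator shaped like cudaMalloc: the return value carries
   the status, so the new pointer has to come back through out_ptr. */
my_error_t my_alloc(void **out_ptr, size_t num_bytes)
{
    if (out_ptr == NULL)
        return MY_ERROR_INVALID_VALUE;

    *out_ptr = malloc(num_bytes);   /* write into the caller's pointer variable */
    if (*out_ptr == NULL)
        return MY_ERROR_OUT_OF_MEMORY;

    return MY_SUCCESS;
}

/* Example caller (hypothetical): returns 0 on success, 1 on failure. */
int demo(void)
{
    int *array = NULL;

    /* Pass the address of the pointer, exactly as with cudaMalloc. */
    my_error_t err = my_alloc((void **)&array, 256 * sizeof(int));
    if (err != MY_SUCCESS)
        return 1;

    assert(array != NULL);          /* my_alloc filled in our variable */
    array[0] = 7;
    free(array);
    return 0;
}
```

With this shape the caller can check the status and use the pointer from a single call, which a malloc-style signature can only approximate by overloading NULL to mean "something went wrong".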
Another well-known function that takes addresses for the same reason is scanf. How many times have you forgotten to write the & before the variable you want the value stored in? ;)
int i;
scanf("%d",&i);
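To see why the extra level of indirection matters when the value being stored is itself a pointer, here is a small C sketch. All function names (set_by_value, set_by_address, try_by_value, try_by_address) and the storage array are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

static int storage[4];   /* hypothetical buffer to point at */

/* Receives a copy of the caller's pointer; assigning to the copy
   leaves the caller's variable untouched. */
void set_by_value(int *p)
{
    p = storage;
    (void)p;             /* silence the unused-value warning */
}

/* Receives the address of the caller's pointer, so it can overwrite it. */
void set_by_address(int **p)
{
    *p = storage;
}

/* Returns what the caller's pointer holds after each kind of call. */
int *try_by_value(void)
{
    int *ptr = NULL;
    set_by_value(ptr);
    return ptr;          /* still NULL */
}

int *try_by_address(void)
{
    int *ptr = NULL;
    set_by_address(&ptr);
    return ptr;          /* now points at storage */
}
```

Calling try_by_value() gives back NULL, while try_by_address() gives back the address of storage. This is the same reason scanf needs &i and cudaMalloc needs &device_array.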