Use of cudaMalloc(). Why the double pointer?

Views: 403 | Published: 2016/8/18 12:51:01 | Tags: c, cuda, malloc

Problem description

I am currently going through the tutorial examples at http://code.google.com/p/stanford-cs193g-sp2010/ to learn CUDA. The code that demonstrates __global__ functions is given below. It simply creates two arrays, one on the CPU and one on the GPU, populates the GPU array with the number 7, and copies the GPU array data into the CPU array.

#include <stdlib.h>
#include <stdio.h>

__global__ void kernel(int *array)
{
  int index = blockIdx.x * blockDim.x + threadIdx.x;

  array[index] = 7;
}

int main(void)
{
  int num_elements = 256;

  int num_bytes = num_elements * sizeof(int);

  // pointers to host & device arrays
  int *device_array = 0;
  int *host_array = 0;

  // malloc a host array
  host_array = (int*)malloc(num_bytes);

  // cudaMalloc a device array
  cudaMalloc((void**)&device_array, num_bytes);

  int block_size = 128;
  int grid_size = num_elements / block_size;

  kernel<<<grid_size,block_size>>>(device_array);

  // download and inspect the result on the host:
  cudaMemcpy(host_array, device_array, num_bytes, cudaMemcpyDeviceToHost);

  // print out the result element by element
  for(int i=0; i < num_elements; ++i)
  {
    printf("%d ", host_array[i]);
  }

  // deallocate memory
  free(host_array);
  cudaFree(device_array);
}

My question is: why is the statement cudaMalloc((void**)&device_array, num_bytes); written with a double pointer? Even the definition of cudaMalloc() at http://www.clear.rice.edu/comp422/resources/cuda/html/group__CUDART__MEMORY_gc63ffd93e344b939d6399199d8b12fef.html says the first argument is a double pointer.

Why not simply return a pointer to the beginning of the allocated memory on the GPU, just as the malloc function does on the CPU?

Recommended answer

Almost all CUDA runtime API functions return an error code (or cudaSuccess if no error occurred). Because the return value is already used for the status, every other result is handed back through a parameter, in effect passed by reference. Plain C has no references, however, so you have to pass the address of the variable in which you want the returned information stored. Since the value being returned here is itself a pointer, you need to pass a pointer to a pointer, i.e. a double pointer.

Another well-known function that takes addresses for the same reason is scanf. How many times have you forgotten to write the & before the variable you want the value stored in? ;)

int i;
scanf("%d",&i);
