Filling an array on the GPU


Question

I want to fill my array on the GPU. To do that, I wrote a generateVector function:

int rand_from_0_to_100_gen(void) {
    return rand() % 100;
}

__device__ void generateVector(int * hData,int count) {

    for (int i = 0; i < count; i++) {
        hData[i] = rand_from_0_to_100_gen();
    }
}

In main, I dynamically allocate memory for the array A:

int *A = NULL;
err = cudaMalloc((void **) &A, numOfData);

generateVector(A,numOfData);

But this gives the error: calling a host function from a device function is not allowed. Why do I get this error?

Recommended answer

You have at least 3 problems:

  1. __device__ marks a function that is callable from GPU code, not from host code. But you are calling generateVector() from the host. In addition, generateVector() itself calls rand_from_0_to_100_gen(), which is an ordinary host function; that call is what the compiler error reports. You can fix both simply by removing the __device__ decorator.
  2. You are using numOfData as the size of the data to allocate. But the size parameter of cudaMalloc is in bytes. Based on your usage of numOfData in your call to generateVector(), you should be using something like sizeof(int)*numOfData for the size of the allocation.
  3. You are passing the pointer A to generateVector(), but A points to device memory. You cannot dereference such pointers in host code (you can only pass them as parameters to API functions like cudaMalloc and cudaMemcpy). Instead you will need to do something like:

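int *A = NULL;    // device pointer
int *h_A = NULL;  // host pointer
cudaMalloc((void **)&A, numOfData*sizeof(int));  // size is given in bytes
h_A = (int *)malloc(numOfData*sizeof(int));
generateVector(h_A, numOfData);                  // fill on the host
cudaMemcpy(A, h_A, numOfData*sizeof(int), cudaMemcpyHostToDevice);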
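For reference, the three fixes can be put together into one small program. This is only a sketch under some assumptions: the helper name uploadRandomVector and the element count 1024 are made up for illustration and are not part of the original question, and error handling is kept minimal.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Plain host function: no __device__ specifier, so rand() may be used.
int rand_from_0_to_100_gen(void) {
    return rand() % 100;  // values in the range 0..99
}

// Fills a HOST buffer; must not be given a device pointer.
void generateVector(int *hData, int count) {
    for (int i = 0; i < count; i++) {
        hData[i] = rand_from_0_to_100_gen();
    }
}

// Hypothetical helper (not in the original post): fills a temporary host
// buffer, copies it to the GPU, and returns the device pointer or NULL.
int *uploadRandomVector(int count) {
    size_t bytes = (size_t)count * sizeof(int);  // allocation size in bytes
    int *h_A = (int *)malloc(bytes);
    if (h_A == NULL) return NULL;
    generateVector(h_A, count);

    int *d_A = NULL;
    cudaError_t err = cudaMalloc((void **)&d_A, bytes);
    if (err == cudaSuccess) {
        err = cudaMemcpy(d_A, h_A, bytes, cudaMemcpyHostToDevice);
    }
    free(h_A);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d_A);  // cudaFree(NULL) performs no operation
        return NULL;
    }
    return d_A;
}

int main(void) {
    const int numOfData = 1024;  // assumed element count
    int *A = uploadRandomVector(numOfData);
    if (A == NULL) return 1;

    // Copy the first few values back to the host to inspect them.
    int firstValues[4];
    if (cudaMemcpy(firstValues, A, sizeof(firstValues),
                   cudaMemcpyDeviceToHost) == cudaSuccess) {
        printf("%d %d %d %d\n", firstValues[0], firstValues[1],
               firstValues[2], firstValues[3]);
    }
    cudaFree(A);
    return 0;
}
```

The data is generated on the CPU and only stored on the GPU, which is probably fine when the array is filled once and then consumed by kernels.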


You may want to read more about how to indicate host and device functions (the __host__, __device__ and __global__ specifiers) in the CUDA documentation.

If you actually do want to use generateVector() from device code (somewhere else in your program), then you will have an additional problem: the rand() function from stdlib.h is not callable from device code. This does not seem to be your intent, however.
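If filling the array in device code is what you want after all, one possible approach (not something the original post uses) is the cuRAND device API, which provides a random generator that is callable from kernels. The sketch below assumes one thread per element; the kernel name generateVectorKernel, the seed 1234 and the launch configuration are illustrative choices only.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Each thread writes one element of the device array.
__global__ void generateVectorKernel(int *dData, int count,
                                     unsigned long long seed) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= count) return;  // guard threads past the end of the array

    curandState state;
    // Same seed for all threads, but a distinct sequence number per thread.
    curand_init(seed, (unsigned long long)i, 0, &state);
    dData[i] = (int)(curand(&state) % 100);  // values in the range 0..99
}

int main(void) {
    const int numOfData = 1024;  // assumed element count
    size_t bytes = (size_t)numOfData * sizeof(int);

    int *A = NULL;
    if (cudaMalloc((void **)&A, bytes) != cudaSuccess) return 1;

    const int threads = 256;
    const int blocks = (numOfData + threads - 1) / threads;  // round up
    generateVectorKernel<<<blocks, threads>>>(A, numOfData, 1234ULL);
    if (cudaGetLastError() != cudaSuccess) {  // catches launch failures
        cudaFree(A);
        return 1;
    }

    // cudaMemcpy waits for the kernel to finish before copying.
    int firstValues[4];
    if (cudaMemcpy(firstValues, A, sizeof(firstValues),
                   cudaMemcpyDeviceToHost) == cudaSuccess) {
        printf("%d %d %d %d\n", firstValues[0], firstValues[1],
               firstValues[2], firstValues[3]);
    }
    cudaFree(A);
    return 0;
}
```

Calling curand_init in every thread is simple but relatively expensive. For repeated use, generator states are usually initialized once, kept in device memory, and reused across kernel launches.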
