Filling array on GPU
Question
I want to fill my array on the GPU. In order to do that, I wrote a generateVector function:
int rand_from_0_to_100_gen(void) {
    return rand() % 100;
}
__device__ void generateVector(int * hData,int count) {
    for (int i = 0; i < count; i++) {
        hData[i] = rand_from_0_to_100_gen();
    }
}
In main, I dynamically allocate memory for the array A:
int *A = NULL;
err = cudaMalloc((void **) &A, numOfData);
generateVector(A,numOfData);
But the function gives the error "Calling a host function from device is not allowed". Why do I get this error?
Recommended answer
You have at least 3 problems:
- __device__ indicates a function that is callable from GPU code, not host code. But you are calling generateVector() from the host. You can fix this simply by removing the __device__ decorator.
- You are using numOfData as the size of the data to allocate. But the required size parameter is in bytes. Based on your usage of numOfData in your call to generateVector(), you should be using something like sizeof(int)*numOfData for the size of the allocation.
- You are passing to generateVector() the pointer A, but A is a pointer that points to device memory. You cannot use these pointers directly in host code (except as parameters to API functions like cudaMalloc and cudaMemcpy). Instead you will need to do something like:
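int *A = NULL;    // device pointer
int *h_A = NULL;  // host pointer
h_A = (int *)malloc(numOfData*sizeof(int));
cudaMalloc((void **) &A, numOfData*sizeof(int));  // size is given in bytes
generateVector(h_A, numOfData);                   // fill the host buffer on the CPU
cudaMemcpy(A, h_A, numOfData*sizeof(int), cudaMemcpyHostToDevice);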
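For reference, the three fixes can be combined into one small program. This is only a sketch under some assumptions: makeDeviceVector is a hypothetical helper name that does not appear in the original post, numOfData is set to 8 purely for illustration, and the copy back to the host is there only to check the result.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Host-only helper: rand() comes from the C standard library and runs on the CPU.
int rand_from_0_to_100_gen(void) {
    return rand() % 100;  // values in [0, 99]
}

// No __device__ qualifier: this function is called from host code.
void generateVector(int *hData, int count) {
    for (int i = 0; i < count; i++) {
        hData[i] = rand_from_0_to_100_gen();
    }
}

// Hypothetical helper (not in the original post): fills a host buffer,
// copies it into a newly allocated device buffer, and returns the device
// pointer, or NULL on failure.
int *makeDeviceVector(int numOfData) {
    size_t bytes = numOfData * sizeof(int);  // allocation size in bytes
    int *h_A = (int *)malloc(bytes);
    if (h_A == NULL) return NULL;
    generateVector(h_A, numOfData);

    int *A = NULL;
    cudaError_t err = cudaMalloc((void **)&A, bytes);
    if (err == cudaSuccess) {
        err = cudaMemcpy(A, h_A, bytes, cudaMemcpyHostToDevice);
    }
    free(h_A);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(A);  // harmless if A is still NULL
        return NULL;
    }
    return A;
}

int main(void) {
    const int numOfData = 8;  // illustrative size
    int *A = makeDeviceVector(numOfData);
    if (A == NULL) return 1;

    // Copy back only to verify that the data reached the device.
    int check[numOfData];
    cudaMemcpy(check, A, sizeof(check), cudaMemcpyDeviceToHost);
    for (int i = 0; i < numOfData; i++) printf("%d ", check[i]);
    printf("\n");

    cudaFree(A);
    return 0;
}
```

The device pointer A is only ever handed to CUDA API calls here; all element access happens through host buffers.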
You may want to read more about how to indicate host and device functions (the __host__, __device__, and __global__ qualifiers) in the CUDA C++ Programming Guide.
If you actually do want to use generateVector() from device code (somewhere else in your program), then you will have an additional problem: the rand() function from stdlib.h is not callable from device code. This does not seem to be your intent, however.
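If filling the array on the device itself were the goal, one possible route is to swap rand() for the cuRAND device API, which is a different technique from the one in the question. The sketch below assumes one thread per element and a fixed seed; generateVectorKernel and fillOnDevice are hypothetical names, and giving every thread its own curand_init call is simple but not necessarily the fastest setup.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Kernel: each thread fills one element using its own cuRAND state.
__global__ void generateVectorKernel(int *dData, int count, unsigned long long seed) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count) {
        curandState state;
        curand_init(seed, i, 0, &state);  // same seed, distinct subsequence per thread
        dData[i] = curand(&state) % 100;  // values in [0, 99]
    }
}

// Hypothetical host wrapper: launches the kernel and waits for it to finish.
cudaError_t fillOnDevice(int *dData, int count, unsigned long long seed) {
    const int threads = 256;
    const int blocks = (count + threads - 1) / threads;
    generateVectorKernel<<<blocks, threads>>>(dData, count, seed);
    cudaError_t err = cudaGetLastError();  // catches launch errors
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();        // catches execution errors
}

int main(void) {
    const int numOfData = 8;  // illustrative size
    int *A = NULL;
    if (cudaMalloc((void **)&A, numOfData * sizeof(int)) != cudaSuccess) return 1;

    cudaError_t err = fillOnDevice(A, numOfData, 1234ULL);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(A);
        return 1;
    }

    // Copy back to the host only to print the generated values.
    int h_A[numOfData];
    cudaMemcpy(h_A, A, sizeof(h_A), cudaMemcpyDeviceToHost);
    for (int i = 0; i < numOfData; i++) printf("%d ", h_A[i]);
    printf("\n");

    cudaFree(A);
    return 0;
}
```

With this approach no host buffer is needed for generation; the host copy exists only to inspect the output.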