Passing a class object to a kernel
Can we pass an object to a kernel function?
Suppose I have a class
class MyClass
{
public :
int value;
float rate;
MyClass()
{
value = 0; rate = 0;
}
MyClass(int v,float r)
{
value = v; rate = r;
}
};
and my kernel takes an array of MyClass objects:
__global__ void MyKernel(MyClass * mc)
{
//Some Calculation
}
Can I pass the array? How should I allocate the memory? Right now, the following code gives me a cudaMemcpy error:
cudaError_t cudaStatus;
MyClass darr[10] ;
cudaStatus = cudaMalloc((void**)&darr, size * sizeof(MyClass));
if (cudaStatus != cudaSuccess) {
fprintf(stderr, "cudaMalloc failed!");
goto label1;
}
cudaStatus = cudaMemcpy(darr, arr, size * sizeof(MyClass), cudaMemcpyHostToDevice);
//arr is a host array
There are a few problems here, not all directly related to whatever error you are seeing.
Firstly, you will have to qualify each class method with both __host__ and __device__ so that the class can be instantiated in both memory spaces (when you do a copy, only the data members of each instance are copied). So your class declaration should look something like this:
class MyClass
{
public :
int value;
float rate;
__device__ __host__ MyClass()
{
value = 0; rate = 0;
}
__device__ __host__ MyClass(int v,float r)
{
value = v; rate = r;
}
__device__ __host__ ~MyClass() {}
};
You then need to correctly allocate the device memory: darr must be a pointer that cudaMalloc fills in, not a host array as in your code. If you want an array of 10 MyClass elements on the device, allocate and copy it to the device like this:
MyClass arr[10];
MyClass *darr;
const size_t sz = size_t(10) * sizeof(MyClass);
cudaMalloc((void**)&darr, sz);
cudaMemcpy(darr, &arr[0], sz, cudaMemcpyHostToDevice);
[disclaimer: all code written in browser, never compiled or tested, use at own risk]
You can then pass darr as an argument to a kernel launched from the host.
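To tie the pieces together, here is a minimal end-to-end sketch of the round trip: allocate, copy to the device, launch, copy back. It is an illustration under assumptions, not tested against your setup. The kernel body, the extra n parameter, and the helpers check and run_example are all hypothetical names introduced for this example, and the class is a trimmed copy of the one above.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

class MyClass
{
public:
    int value;
    float rate;
    __host__ __device__ MyClass() : value(0), rate(0.0f) {}
    __host__ __device__ MyClass(int v, float r) : value(v), rate(r) {}
};

// Hypothetical kernel body: each thread updates one element of the array.
__global__ void MyKernel(MyClass *mc, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        mc[i].value += 1;
        mc[i].rate *= 2.0f;
    }
}

// Hypothetical helper: print a CUDA error and return false on failure.
static bool check(cudaError_t status, const char *what)
{
    if (status != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(status));
        return false;
    }
    return true;
}

// Hypothetical helper: copy arr to the device, run the kernel, copy back.
// Assumes n > 0.
bool run_example(MyClass *arr, int n)
{
    const size_t sz = size_t(n) * sizeof(MyClass);
    MyClass *darr = nullptr;  // device pointer, filled in by cudaMalloc
    if (!check(cudaMalloc((void **)&darr, sz), "cudaMalloc")) return false;

    bool ok = check(cudaMemcpy(darr, arr, sz, cudaMemcpyHostToDevice),
                    "cudaMemcpy to device");
    if (ok) {
        const int threads = 32;
        const int blocks = (n + threads - 1) / threads;
        MyKernel<<<blocks, threads>>>(darr, n);
        ok = check(cudaGetLastError(), "kernel launch")
          && check(cudaDeviceSynchronize(), "kernel execution")
          && check(cudaMemcpy(arr, darr, sz, cudaMemcpyDeviceToHost),
                   "cudaMemcpy to host");
    }
    cudaFree(darr);
    return ok;
}

int main()
{
    MyClass arr[10];
    for (int i = 0; i < 10; ++i) arr[i] = MyClass(i, 0.5f);
    if (!run_example(arr, 10)) return 1;
    printf("arr[3]: value=%d rate=%f\n", arr[3].value, arr[3].rate);
    return 0;
}
```

Passing the element count alongside the pointer lets the kernel guard against threads whose index falls past the end of the array, since the launch rounds the thread count up to a multiple of the block size.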