Passing a class object to a kernel

Problem description

Can we pass an object to a kernel function?

Suppose I have a class:

   class MyClass
   {
        public :
               int value;
               float rate;
               MyClass()
               {
                   value = 0; rate = 0;
               }
               MyClass(int v,float r)
               {
                   value = v; rate = r;
               }
   };

and my kernel takes an array of MyClass objects:

   __global__ void MyKernel(MyClass * mc)
   {
    //Some Calculation
   }

Can I pass the array? How should I allocate the memory? Right now I am trying the following code and get a cudaMemcpy error:

    cudaError_t cudaStatus;

    MyClass darr[10] ;
    cudaStatus = cudaMalloc((void**)&darr, size * sizeof(MyClass));

    if (cudaStatus != cudaSuccess) {
            fprintf(stderr, "cudaMalloc failed!");
    goto label1;
    }

    cudaStatus = cudaMemcpy(darr, arr, size * sizeof(MyClass), cudaMemcpyHostToDevice);
    //arr is a host array

Solution

There are a few problems here, not all directly related to whatever error you are seeing.

Firstly, you will have to declare each class method with both the __host__ and __device__ qualifiers, so that it is compiled for both the host and the device and the class can be instantiated in both memory spaces (when you do a copy, only the data members of each instance are copied). So your class declaration should look something like this:

class MyClass
{
    public :
        int value;
        float rate;
        __device__ __host__ MyClass()
        {
            value = 0; rate = 0;
        }
        __device__ __host__ MyClass(int v,float r)
        {
            value = v; rate = r;
        }
        __device__ __host__ ~MyClass() {}
};
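
To illustrate why the qualifiers matter, here is a minimal sketch in which a kernel constructs MyClass objects directly on the device. The kernel name InitKernel, the extra length parameter n, and the launch configuration are assumptions made for this example and are not part of the original question. The sketch also leaves out the empty destructor for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

class MyClass
{
public:
    int value;
    float rate;
    __host__ __device__ MyClass() : value(0), rate(0.0f) {}
    __host__ __device__ MyClass(int v, float r) : value(v), rate(r) {}
};

// Hypothetical kernel: each thread constructs one object on the device.
// Calling the two-argument constructor here only compiles because it is
// marked __device__.
__global__ void InitKernel(MyClass *mc, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        mc[i] = MyClass(i, 0.5f * i);
    }
}

int main()
{
    const int n = 10;
    MyClass host[n];          // the host-side default constructor runs here
    MyClass *dev = nullptr;   // device pointer, filled in by cudaMalloc

    if (cudaMalloc((void**)&dev, n * sizeof(MyClass)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    InitKernel<<<1, 32>>>(dev, n);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(MyClass),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    printf("host[3]: value=%d rate=%f\n", host[3].value, host[3].rate);
    return 0;
}
```

If the __device__ qualifier were removed from the constructor, nvcc would reject the call inside InitKernel, because a __host__-only function cannot be called from device code.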

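You then need to correctly allocate the device memory. In your code darr is declared as a host array (MyClass darr[10]), so &darr is not the address of a pointer that cudaMalloc can fill in, and the later cudaMemcpy is handed a host address as its device destination. If you want an array of 10 MyClass objects on the device, declare darr as a pointer, then allocate and copy to the device like this: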

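MyClass arr[10];                                  // host array
MyClass *darr;                                    // device pointer, set by cudaMalloc
const size_t sz = size_t(10) * sizeof(MyClass);   // size in bytes
cudaMalloc((void**)&darr, sz);                    // allocate device memory
cudaMemcpy(darr, &arr[0], sz, cudaMemcpyHostToDevice);  // copy host -> device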

[disclaimer: all code written in browser, never compiled or tested, use at own risk]

You can then pass darr as an argument to a kernel from the host.
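
Putting the pieces together, the whole round trip might look like the sketch below. It is an illustration, not the asker's actual program: the CUDA_CHECK macro is a hypothetical helper, the kernel body is a placeholder calculation, and the extra length parameter n and the launch configuration are assumptions.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not from the original answer):
// print the failing call and exit on any CUDA runtime error.
#define CUDA_CHECK(call)                                          \
    do {                                                          \
        cudaError_t err_ = (call);                                \
        if (err_ != cudaSuccess) {                                \
            fprintf(stderr, "%s failed: %s\n", #call,             \
                    cudaGetErrorString(err_));                    \
            exit(EXIT_FAILURE);                                   \
        }                                                         \
    } while (0)

class MyClass
{
public:
    int value;
    float rate;
    __host__ __device__ MyClass() : value(0), rate(0.0f) {}
    __host__ __device__ MyClass(int v, float r) : value(v), rate(r) {}
};

// Placeholder calculation: each thread updates one element in place.
__global__ void MyKernel(MyClass *mc, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        mc[i].value += 1;
        mc[i].rate *= 2.0f;
    }
}

int main()
{
    const int n = 10;
    MyClass arr[n];                       // host array
    for (int i = 0; i < n; ++i) {
        arr[i] = MyClass(i, 1.0f);
    }

    MyClass *darr = nullptr;              // device pointer
    const size_t sz = size_t(n) * sizeof(MyClass);
    CUDA_CHECK(cudaMalloc((void**)&darr, sz));
    CUDA_CHECK(cudaMemcpy(darr, arr, sz, cudaMemcpyHostToDevice));

    MyKernel<<<1, 32>>>(darr, n);
    CUDA_CHECK(cudaGetLastError());       // catches launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors during execution

    CUDA_CHECK(cudaMemcpy(arr, darr, sz, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(darr));

    for (int i = 0; i < n; ++i) {
        printf("arr[%d]: value=%d rate=%f\n", i, arr[i].value, arr[i].rate);
    }
    return 0;
}
```

Checking every runtime call, including cudaGetLastError after the launch, makes it much easier to see which step fails, which is the information that was missing from the original cudaMemcpy error report.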
