Ideas for CUDA kernel calls with parameters exceeding 256 bytes


Question

I have a couple of structures whose combined size exceeds the 256 bytes allowed for parameters in a kernel call.

Both structures are already allocated and copied to device global memory.

1) How can I use these structures in the same kernel without passing them as parameters?

More details: separately, each structure can be passed as a parameter, for example in different kernels. But:

2) How can I use both structures in the same kernel?

Solution

As Robert Crovella suggested in his comment, you should be able to simply pass pointers to those areas. I had a similar problem in OpenCL. This is how I implemented the structs:

(My kernel and host functions are in OpenCL, so the syntax will differ for you, but the idea is the same.)

The following two structs are defined in my 'Mapper.c' (host code):

typedef struct data
{
  double dattr[10];
  int d_id;
  int bestCent;
}Data;


typedef struct cent
{
  double cattr[5];
  int c_id;
}Cent;

Data *dataNode;
Cent *centNode;
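
The point of passing pointers is that the argument list then grows by one pointer per structure, whatever the structure's size. A minimal sketch in plain C, assuming a host where `double` is 8 bytes and `int` is 4 bytes (as on the OpenCL device side); the helper names `args_by_value` and `args_by_pointer` are hypothetical and only illustrate the arithmetic:

```c
#include <stddef.h>

/* Same fields as the host structs in Mapper.c. */
typedef struct data {
    double dattr[10];
    int d_id;
    int bestCent;
} Data;

typedef struct cent {
    double cattr[5];
    int c_id;
} Cent;

/* Assumptions stated in the text: the host types have the sizes
   the device expects, so the bytes copied over mean the same thing. */
_Static_assert(sizeof(double) == 8, "device expects 8-byte double");
_Static_assert(sizeof(int) == 4, "device expects 4-byte int");
_Static_assert(offsetof(Data, d_id) == 10 * sizeof(double), "d_id follows dattr");
_Static_assert(offsetof(Cent, c_id) == 5 * sizeof(double), "c_id follows cattr");

/* Argument bytes needed if both structs were passed by value. */
size_t args_by_value(void)
{
    return sizeof(Data) + sizeof(Cent);
}

/* Argument bytes needed if only pointers to the structs are passed. */
size_t args_by_pointer(void)
{
    return sizeof(Data *) + sizeof(Cent *);
}
```

These two example structs are small, but the pointer cost stays the same for structures of any size, which is why the 256-byte limit stops being a concern. The exact `sizeof(Cent)` depends on the ABI's padding rules, so it is worth checking that host and device agree on it before copying arrays of structs.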

After allocating memory in the device's global memory, I transferred the data. I had to repeat the struct definitions in my kernel file, as below:

mapper.cl:

#pragma OPENCL EXTENSION cl_khr_fp64 : enable
typedef struct data
{
  double dattr[10];
  int d_id;
  int bestCent;
}Data;


typedef struct cent
{
  double cattr[5];
  int c_id;
}Cent;

__kernel void mapper(__global int *keyMobj, __global int *valueMobj,__global Data *dataMobj,__global Cent *centMobj)
{
    int i= get_global_id(0);
    int j,k,color=0;
    double dmin=1000000.0, dx;
    for(j=0; j<2; j++)      //here 2 is number of centroids considered
     {
        dx = 0.0;
        for(k=0; k<2; k++)
        {
           dx+= ((centMobj[j].cattr[k])-(dataMobj[i].dattr[k])) * ((centMobj[j].cattr[k])-(dataMobj[i].dattr[k]));
        }  
        if(dx<dmin)            
        {  color = j;   
           dmin = dx;
        }
     }  
     keyMobj[i] = color;
     valueMobj[i] = dataMobj[i].d_id;

}
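
To check the kernel's output, the same assignment can be computed on the host. The following is a sketch only, not part of the original answer: `nearest_centroid` and `mapper_host` are hypothetical names, and the centroid and attribute counts (hard-coded as 2 in the kernel) are taken as parameters here.

```c
/* Same fields as the structs in mapper.cl. */
typedef struct data {
    double dattr[10];
    int d_id;
    int bestCent;
} Data;

typedef struct cent {
    double cattr[5];
    int c_id;
} Cent;

/* Index of the centroid with the smallest squared Euclidean distance
   to *d, comparing only the first nattr attributes. */
int nearest_centroid(const Data *d, const Cent *cents, int ncent, int nattr)
{
    int color = 0;
    double dmin = 1000000.0; /* same starting bound as the kernel */
    for (int j = 0; j < ncent; j++) {
        double dx = 0.0;
        for (int k = 0; k < nattr; k++) {
            double diff = cents[j].cattr[k] - d->dattr[k];
            dx += diff * diff;
        }
        if (dx < dmin) {
            color = j;
            dmin = dx;
        }
    }
    return color;
}

/* Sequential stand-in for running the kernel over n work-items:
   iteration i does what work-item i does in mapper.cl. */
void mapper_host(int *keys, int *values, const Data *data,
                 const Cent *cents, int n, int ncent, int nattr)
{
    for (int i = 0; i < n; i++) {
        keys[i] = nearest_centroid(&data[i], cents, ncent, nattr);
        values[i] = data[i].d_id;
    }
}
```

Calling `mapper_host(keys, values, dataNode, centNode, n, 2, 2)` and comparing `keys` and `values` with the buffers read back from the device gives a quick sanity check that the structs arrived with the expected layout.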

As you can see, all four kernel arguments (keyMobj, valueMobj, dataMobj and centMobj) are pointers; the structs themselves are reached through dataMobj and centMobj.

kernel = clCreateKernel(program, "mapper", &ret);
ret = clSetKernelArg(kernel, 0, sizeof(cl_mem), (void *)&keyMobj);
ret = clSetKernelArg(kernel, 1, sizeof(cl_mem), (void *)&valueMobj);
ret = clSetKernelArg(kernel, 2, sizeof(cl_mem), (void *)&dataMobj);
ret = clSetKernelArg(kernel, 3, sizeof(cl_mem), (void *)&centMobj);

The lines above belong to the host-side code (mapper.c): the first creates the kernel object for the kernel function in mapper.cl, and the next four lines (clSetKernelArg) pass the arguments to the kernel function.
