Using std::vector in CUDA device code


Problem description

The question is: is there a way to use the class "vector" in CUDA kernels? When I try, I get the following error:

error : calling a host function("std::vector<int, std::allocator<int> > ::push_back") from a __device__/__global__ function not allowed

So is there a way to use a vector in a __global__ function? I recently tried the following:

  1. Create a new CUDA project
  2. Go to the properties of the project
  3. Open CUDA C/C++
  4. Go to Device
  5. Set the value of "Code Generation" to: compute_20,sm_20

After that, I was able to use the printf standard library function in my CUDA kernel.

Is there a way to use the standard library class vector in kernel code, the same way printf is supported? This is an example of using printf in kernel code:

// This code counts the 3s in an array using CUDA.
// private_count is an array that holds each thread's result separately;
// it is assumed to be zero-initialized before the launch.

__global__ void countKernel(int *a, int length, int *private_count)
{
    printf("%d\n", threadIdx.x);  // prints the thread ID; this works

    // vector<int> y;
    // y.push_back(0);  // is there a possibility to do this?

    // Each thread scans its own contiguous chunk of `length` elements.
    unsigned int offset = threadIdx.x * length;
    for (unsigned int i = offset; i < offset + length; i++)
    {
        if (a[i] == 3)
        {
            private_count[threadIdx.x]++;
            printf("%d ", a[i]);
        }
    }
}

Solution

You can't use STL containers such as std::vector in CUDA device code, because their member functions are host functions. You may be able to use the Thrust library to do what you want. Otherwise, just copy the contents of the vector to the device and operate on it normally.
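
Both options can be sketched as follows. This is a minimal illustration, not the answerer's code: the names countThreesKernel, countThreesRaw and countThreesThrust are hypothetical, and the block size of 256 threads is an arbitrary assumption. The first function copies the contents of a std::vector to the device with cudaMemcpy and runs a kernel over the raw pointer; the second lets Thrust manage the device copy and the count.

```cuda
#include <cuda_runtime.h>
#include <thrust/count.h>
#include <thrust/device_vector.h>
#include <vector>

// Hypothetical kernel: each thread tests one element and atomically
// increments a single counter in global memory when it sees a 3.
__global__ void countThreesKernel(const int* data, int n, int* count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && data[i] == 3) {
        atomicAdd(count, 1);
    }
}

// Option 1: copy the vector's contents to the device by hand and pass
// raw pointers to the kernel. Returns -1 if any CUDA call fails.
int countThreesRaw(const std::vector<int>& host)
{
    const int n = static_cast<int>(host.size());
    if (n == 0) return 0;

    int* dData = nullptr;
    int* dCount = nullptr;
    int result = 0;

    bool ok = cudaMalloc(&dData, n * sizeof(int)) == cudaSuccess
           && cudaMalloc(&dCount, sizeof(int)) == cudaSuccess
           && cudaMemcpy(dData, host.data(), n * sizeof(int),
                         cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemset(dCount, 0, sizeof(int)) == cudaSuccess;

    if (ok) {
        const int threads = 256;  // assumed block size
        const int blocks = (n + threads - 1) / threads;
        countThreesKernel<<<blocks, threads>>>(dData, n, dCount);
        // The blocking copy below also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&result, dCount, sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(dData);   // cudaFree accepts a null pointer
    cudaFree(dCount);
    return ok ? result : -1;
}

// Option 2: let Thrust own the device copy and perform the count.
int countThreesThrust(const std::vector<int>& host)
{
    // The range constructor copies the host data to device memory.
    thrust::device_vector<int> dev(host.begin(), host.end());
    return static_cast<int>(thrust::count(dev.begin(), dev.end(), 3));
}
```

Note that thrust::device_vector is itself a host-side container: it cannot be declared or resized inside a kernel either. If a hand-written kernel needs its data, thrust::raw_pointer_cast(dev.data()) yields a plain device pointer that can be passed as a kernel argument.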
