Thrust vector transformation involving neighbor elements


Question

I have a vector, and I would like to do the following using CUDA and Thrust transformations:

// thrust::device_vector v;
// for k times:
//     calculate constants a and b as functions of k;
//     for (i=0; i < v.size(); i++)
//         v[i] = a*v[i] + b*v[i+1];

How should I implement this correctly? One option is to keep a second vector w, apply thrust::transform to v, and save the results to w. But k is not known ahead of time, and I don't want to create w1, w2, ... and waste a lot of GPU memory. I would also like to minimize the amount of data copying. I'm not sure how to do this with a single vector without the values stepping on each other, since element i may be overwritten while element i-1 still needs its old value. Does Thrust provide something for this?

Recommended answer

If v.size() is large enough to fully utilize the GPU, you can launch K kernels (one thrust::transform call per step) using a single extra buffer and no extra data transfer. The two vectors alternate roles: each step reads from one and writes to the other, so only one additional buffer is needed regardless of K.

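using namespace thrust::placeholders;      // provides _1 and _2

// Scratch buffer, allocated once (element type assumed to be float here).
thrust::device_vector<float> u(v.size());
for (int k = 0; k < K; )
{
    // calculate a & b for this k
    // v -> u; the last element has no right neighbor, so it is not written
    thrust::transform(v.begin(), v.end() - 1, v.begin() + 1, u.begin(), a * _1 + b * _2);
    if (++k >= K)
        break;                             // odd K: the result is in u

    // calculate a & b for this k
    // u -> v
    thrust::transform(u.begin(), u.end() - 1, u.begin() + 1, v.begin(), a * _1 + b * _2);
    ++k;                                   // even K: the result is in v
}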
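
A host-side reference can be handy for checking what the GPU version produces. The sketch below is plain C++ (std::transform, not thrust::transform) and rests on several assumptions that are not in the original: the coefficient schedule `coeffs_for`, the function names, the element type `double`, and the choice to leave the last element unchanged because it has no right neighbor. Instead of writing out both directions of the alternation, it swaps the two buffers after each pass, then can be compared against the sequential in-place loop from the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical coefficient schedule; the question only says a and b depend on k.
struct Coeffs { double a, b; };
inline Coeffs coeffs_for(int k) { return {static_cast<double>(k + 1), 2.0}; }

// Sequential in-place reference. On one thread this is safe because
// v[i + 1] is read before iteration i + 1 overwrites it.
inline std::vector<double> step_in_place(std::vector<double> v, int K) {
    for (int k = 0; k < K; ++k) {
        const Coeffs c = coeffs_for(k);
        for (std::size_t i = 0; i + 1 < v.size(); ++i)
            v[i] = c.a * v[i] + c.b * v[i + 1];
    }
    return v;
}

// Double-buffered version: read from v, write to u, then swap the buffers.
// Only one extra buffer is allocated, no matter how large K is.
inline std::vector<double> step_ping_pong(std::vector<double> v, int K) {
    if (v.empty()) return v;
    std::vector<double> u(v.size());
    for (int k = 0; k < K; ++k) {
        const Coeffs c = coeffs_for(k);
        std::transform(v.begin(), v.end() - 1, v.begin() + 1, u.begin(),
                       [c](double x, double next) { return c.a * x + c.b * next; });
        u.back() = v.back();  // assumption: the last element is carried over unchanged
        v.swap(u);            // the newest values are now in v
    }
    return v;
}
```

Calling `step_ping_pong(v, K)` and `step_in_place(v, K)` on the same input should give the same vector. The second buffer is only needed once the elements are updated in parallel, which is the situation on the GPU. thrust::device_vector has a swap member as well, so the same swap-per-pass shape should carry over to the device code and leaves the result in v whether K is odd or even.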
