thrust reduction result on device memory

Question

Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory? In case it is, is it just as easy as assigning the value to a cudaMalloc'ed area, or should I use a thrust::device_ptr?

Recommended answer

Is it possible to leave the return value of a thrust::reduce operation in device-allocated memory?

The short answer is no.

thrust::reduce returns a value, the result of the reduction. This value must be deposited in a host-resident variable:

Take for example reduce, which is synchronous and always returns its result to the CPU:

template<typename Iterator, typename T> 
T reduce(Iterator first, Iterator last, T init); 

Once the result of the operation has been returned to the CPU, you can copy it to the GPU if you like:

#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

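int main(){

    // 256 elements, each set to 1, stored in device memory
    thrust::device_vector<int> data(256, 1);
    // one-element device vector that will hold the sum
    thrust::device_vector<int> result(1);
    // reduce returns the sum to the host; the assignment copies it back to the device
    result[0] = thrust::reduce(data.begin(), data.end());
    // reading result[0] copies the value to the host again for printing
    std::cout << "result = " << result[0] << std::endl;
    return 0;
}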
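
Since the question asks specifically about cudaMalloc'ed memory and thrust::device_ptr, here is a minimal sketch of both ways to store the host-side result there. The names d_result, sum and check are illustrative and not from the original answer, and error handling is kept to a single check for brevity.

```cuda
#include <iostream>
#include <cuda_runtime.h>
#include <thrust/device_ptr.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>

int main() {
    thrust::device_vector<int> data(256, 1);

    // Raw device allocation that should end up holding the sum.
    int* d_result = nullptr;
    if (cudaMalloc(&d_result, sizeof(int)) != cudaSuccess) return 1;

    // The reduction result always arrives on the host first.
    int sum = thrust::reduce(data.begin(), data.end());

    // Option 1: plain CUDA runtime copy from host to device.
    cudaMemcpy(d_result, &sum, sizeof(int), cudaMemcpyHostToDevice);

    // Option 2: wrap the raw pointer and assign through it;
    // the assignment performs the host-to-device copy.
    thrust::device_ptr<int> p = thrust::device_pointer_cast(d_result);
    *p = sum;

    // Copy back only to verify the stored value.
    int check = 0;
    cudaMemcpy(&check, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    std::cout << "result = " << check << std::endl;

    cudaFree(d_result);
    return 0;
}
```

Either option still involves a round trip through host memory; they differ only in how the final copy to the device is written.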

Another possible alternative is to use thrust::reduce_by_key, which writes the reduction result to device memory rather than copying it to host memory. If you use a single key for your entire array, the net result will be a single output, similar to thrust::reduce.
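
The reduce_by_key approach might look like the following sketch. It assumes a constant_iterator supplies the same key (0) for every element and a discard_iterator throws away the key output; both choices are illustrative and not part of the original answer. Note that the call still returns a pair of iterators to the host, so it is not guaranteed to be free of host synchronization.

```cuda
#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/iterator/constant_iterator.h>
#include <thrust/iterator/discard_iterator.h>
#include <thrust/reduce.h>

int main() {
    thrust::device_vector<int> data(256, 1);
    thrust::device_vector<int> result(1);  // device-side slot for the sum

    // Every element shares key 0, so the whole range forms one segment.
    thrust::constant_iterator<int> keys_first = thrust::make_constant_iterator(0);

    thrust::reduce_by_key(keys_first,
                          keys_first + data.size(),
                          data.begin(),
                          thrust::make_discard_iterator(),  // key output is not needed
                          result.begin());                  // sum is written to device memory

    // Reading result[0] copies the value to the host, here only for display.
    std::cout << "result = " << result[0] << std::endl;
    return 0;
}
```

With 256 elements all equal to 1, this prints result = 256, and the sum stays in result on the device for any later kernel or Thrust call.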
