CUDA Stream compaction: understanding the concept


Question

I am using CUDA/Thrust/CUDPP. As I understand it, in stream compaction, certain items in an array are marked as invalid and then "removed".

Now what does "removal" really mean here? Suppose the original array A has length 6. If 2 elements are invalid (by whatever condition we may provide), then:

  1. Does the system create a new array of size 4 in GPU memory to store the valid elements and hold the final result?

  2. Or does it physically remove the invalid elements from memory and shrink the original array A down to size 4, keeping only the valid elements?

In either case, doesn't that mean that dynamic memory allocation is happening under the hood? But I had heard that dynamic memory allocation is not possible in the CUDA world.

Recommended answer

First, dynamic memory allocation is possible in CUDA on Compute Capability 2.0 and higher devices. The CUDA runtime library supports malloc/free and new/delete in __device__ functions. But that is not really germane to the answer.
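
For illustration only, here is a minimal sketch of device-side allocation, assuming a device of Compute Capability 2.0 or higher and the default device heap size. The kernel name `per_thread_scratch` and the host helper `run_scratch_demo` are hypothetical and are not part of any library.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread allocates a small scratch buffer from the device heap,
// fills it, sums it, and frees it again.
__global__ void per_thread_scratch(int* results, int scratch_len) {
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int* scratch = static_cast<int*>(malloc(scratch_len * sizeof(int)));
    if (scratch == nullptr) {   // device heap exhausted
        results[tid] = -1;
        return;
    }
    int sum = 0;
    for (int i = 0; i < scratch_len; ++i) {
        scratch[i] = tid + i;
        sum += scratch[i];
    }
    results[tid] = sum;
    free(scratch);
}

// Hypothetical host helper: launches one block of `threads` threads and
// copies the per-thread sums back to the host.
std::vector<int> run_scratch_demo(int threads, int scratch_len) {
    int* d_results = nullptr;
    cudaMalloc(&d_results, threads * sizeof(int));
    per_thread_scratch<<<1, threads>>>(d_results, scratch_len);
    std::vector<int> h_results(threads);
    // cudaMemcpy waits for the kernel to finish before copying.
    cudaMemcpy(h_results.data(), d_results, threads * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_results);
    return h_results;
}
```

Calling `run_scratch_demo(4, 4)` should give each thread the sum of `tid + 0` through `tid + 3`. Error checking on the CUDA calls is omitted to keep the sketch short.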

Typically a large-enough output array is provided (pre-allocated, often the same size as the input array) and the output is written to it. No dynamic allocation is required, but storage may be wasted. This is what CUDPP and Thrust do. An alternative would be to count the valid elements first, then allocate the output GPU memory dynamically using cudaMalloc called from the host CPU.
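
Both approaches can be sketched with Thrust roughly as follows. This is a sketch under assumptions, not the internals of either library: the predicate `IsValid` (treating negative values as invalid) and the function names `compact_preallocated` and `compact_counted` are made up for the example.

```cuda
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/count.h>
#include <thrust/device_ptr.h>
#include <thrust/device_vector.h>
#include <stdexcept>
#include <vector>

// Hypothetical validity rule: negative values count as "invalid".
struct IsValid {
    __host__ __device__ bool operator()(int x) const { return x >= 0; }
};

// Approach 1: pre-allocate an output as large as the input and compact into it.
std::vector<int> compact_preallocated(const std::vector<int>& host_in) {
    thrust::device_vector<int> in(host_in.begin(), host_in.end());
    thrust::device_vector<int> out(in.size());   // worst-case size
    auto out_end = thrust::copy_if(in.begin(), in.end(), out.begin(), IsValid());
    // Only [out.begin(), out_end) holds results; the tail is unused storage.
    std::vector<int> host_out(out_end - out.begin());
    thrust::copy(out.begin(), out_end, host_out.begin());
    return host_out;
}

// Approach 2: count the valid elements first, then allocate exactly that much
// GPU memory with cudaMalloc from the host.
std::vector<int> compact_counted(const std::vector<int>& host_in) {
    thrust::device_vector<int> in(host_in.begin(), host_in.end());
    auto n_valid = thrust::count_if(in.begin(), in.end(), IsValid());
    int* raw_out = nullptr;
    if (cudaMalloc(&raw_out, n_valid * sizeof(int)) != cudaSuccess)
        throw std::runtime_error("cudaMalloc failed");
    thrust::device_ptr<int> out(raw_out);
    thrust::copy_if(in.begin(), in.end(), out, IsValid());
    std::vector<int> host_out(n_valid);
    cudaMemcpy(host_out.data(), raw_out, n_valid * sizeof(int),
               cudaMemcpyDeviceToHost);
    cudaFree(raw_out);
    return host_out;
}
```

In neither version is the input array shrunk in place: the valid elements are copied, in their original order, to a separate output buffer. For the 6-element example from the question with 2 invalid entries, the first version leaves 2 unused slots at the end of `out`, while the second pays for an extra counting pass over the input to get an exact-size allocation.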
