Counting occurrences of numbers in a CUDA array


Problem description

I have an array of unsigned integers stored on the GPU with CUDA (typically 1000000 elements). I would like to count the occurrences of every number in the array. There are only a few distinct numbers (about 10), but their values can range from 1 to 1000000. About nine tenths of the elements are 0, and I don't need a count for those. The result looks something like this:

58458 -> 1000 occurrences
15 -> 412 occurrences

I have an implementation using atomicAdd, but it is too slow because many threads write to the same address. Does anyone know of a fast, efficient method?

Recommended answer

You can implement a histogram by first sorting the numbers and then doing a keyed reduction. After sorting, equal values sit next to each other, so each run of identical keys reduces to a single count.

The most straightforward method is to use thrust::sort followed by thrust::reduce_by_key. This is also often much faster than ad hoc binning based on atomics.
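A minimal sketch of the sort-then-reduce approach, assuming the data already lives in a thrust::device_vector. The names is_zero and count_nonzero_occurrences are hypothetical helpers introduced here for illustration. Dropping the zeros before sorting is also an assumption, based on the question saying their count is not needed, and it shrinks the sort to roughly a tenth of the input.

```cuda
#include <thrust/device_vector.h>
#include <thrust/host_vector.h>
#include <thrust/iterator/constant_iterator.h>
#include <thrust/reduce.h>
#include <thrust/remove.h>
#include <thrust/sort.h>

#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical predicate: selects the zeros that should be ignored.
struct is_zero {
    __host__ __device__ bool operator()(unsigned int x) const { return x == 0u; }
};

// Hypothetical helper: returns (value, count) pairs for every nonzero value,
// ordered by ascending value. The caller's array is not modified.
std::vector<std::pair<unsigned int, int>>
count_nonzero_occurrences(const thrust::device_vector<unsigned int>& input)
{
    // Copy only the nonzero elements into a scratch buffer.
    thrust::device_vector<unsigned int> keys(input.size());
    auto keys_end = thrust::remove_copy_if(input.begin(), input.end(),
                                           keys.begin(), is_zero());
    keys.resize(keys_end - keys.begin());

    // Sorting makes equal values adjacent.
    thrust::sort(keys.begin(), keys.end());

    // Sum a constant 1 over each run of equal keys. The outputs are sized
    // for the worst case in which every key is distinct.
    thrust::device_vector<unsigned int> unique_keys(keys.size());
    thrust::device_vector<int> counts(keys.size());
    auto ends = thrust::reduce_by_key(keys.begin(), keys.end(),
                                      thrust::constant_iterator<int>(1),
                                      unique_keys.begin(), counts.begin());
    const std::size_t n = ends.first - unique_keys.begin();

    // Copy the (small) result back to the host.
    thrust::host_vector<unsigned int> h_keys(unique_keys.begin(),
                                             unique_keys.begin() + n);
    thrust::host_vector<int> h_counts(counts.begin(), counts.begin() + n);

    std::vector<std::pair<unsigned int, int>> result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        result.emplace_back(h_keys[i], h_counts[i]);
    }
    return result;
}
```

Calling count_nonzero_occurrences on the device array yields one pair per distinct nonzero value, which can then be printed in the "value -> count occurrences" form shown in the question. If the array is a raw device pointer rather than a device_vector, wrapping it with thrust::device_pointer_cast should allow the same algorithms to be applied.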
