Cumulative summation in CUDA
Question
Can someone please point me in the right direction on how to do this type of calculation in parallel, or tell me what the general name of this method is? I don't think these will return the same result.
C++
for (int i = 1; i < width; i++)
    x[i] = x[i] + x[i-1];
CUDA
int i = blockIdx.x * blockDim.x + threadIdx.x;
if ((i > 0) && (i < (width)))
    X[i] = X[i] + X[i-1];
Recommended answer
This looks like a cumulative sum operation, in which the final value of x[i] is the sum of all values x[0]...x[i] in the original array.
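As a sequential reference point, the same result can be produced with the standard library. This is a minimal sketch; the helper name `cumulative_sum` is hypothetical and is not part of the original question.

```cpp
#include <numeric>
#include <vector>

// Inclusive cumulative sum: out[i] = x[0] + x[1] + ... + x[i].
// `cumulative_sum` is a hypothetical helper name used only for this sketch.
std::vector<int> cumulative_sum(const std::vector<int>& x) {
    std::vector<int> out(x.size());
    // std::partial_sum writes the running total of the input range to `out`.
    std::partial_sum(x.begin(), x.end(), out.begin());
    return out;
}
```

For the input {3, 1, 4, 1, 5} this yields {3, 4, 8, 9, 14}, which is what the C++ loop in the question computes in place.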
In CUDA, this is called a scan or prefix-sum operation, and it can be efficiently parallelized. See e.g. this lecture for examples.
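The kernel in the question does not match the sequential loop because each thread reads X[i-1] without knowing whether another thread has already updated it, which is a data race. One well-known way to structure a parallel scan is the Hillis-Steele formulation, sketched below as a sequential C++ simulation. This is an illustration under stated assumptions, not necessarily the approach the linked lecture takes; the function name `hillis_steele_scan` is hypothetical, and each iteration of the inner loop stands in for the work of one GPU thread.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hillis-Steele inclusive scan, simulated sequentially on the CPU.
// Every "thread" i in a pass reads only from `in` and writes only to `out`,
// so no element is read after being overwritten within the same pass.
// `hillis_steele_scan` is a hypothetical name used only for this sketch.
std::vector<int> hillis_steele_scan(std::vector<int> in) {
    std::vector<int> out(in.size());
    // The offset doubles each pass, giving about log2(n) passes in total.
    for (std::size_t offset = 1; offset < in.size(); offset *= 2) {
        for (std::size_t i = 0; i < in.size(); ++i) {
            // Element i adds the partial sum that ends `offset` positions earlier.
            out[i] = (i >= offset) ? in[i] + in[i - offset] : in[i];
        }
        // The output of this pass becomes the input of the next one.
        std::swap(in, out);
    }
    return in;
}
```

In a real kernel the inner loop disappears (one thread per element) and a barrier such as __syncthreads() is needed between passes within a block. Arrays larger than one block need an additional step to combine the per-block results. Note that this formulation performs O(n log n) additions, more than the sequential loop, in exchange for a short chain of dependent steps.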