OpenCL/CUDA: Two-stage reduction algorithm

Views: 142  Published: 2020/5/20 18:59:06  Tags: algorithm opencl


Problem description

Large arrays can be reduced by calling __reduce() multiple times.

The following code, however, uses only two stages and is documented here:

However, I am unable to understand the algorithm behind this two-stage reduction. Can someone give a simpler explanation?

__kernel
void reduce(__global float* buffer,
        __local float* scratch,
        const int length,
        __global float* result) {

    int global_index = get_global_id(0);
    float accumulator = INFINITY;
    // Loop sequentially over chunks of input vector
    while (global_index < length) {
        float element = buffer[global_index];
        accumulator = (accumulator < element) ? accumulator : element;
        global_index += get_global_size(0);
    }

    // Perform parallel (tree) reduction in local memory;
    // the halving loop assumes the work-group size is a power of two
    int local_index = get_local_id(0);
    scratch[local_index] = accumulator;
    barrier(CLK_LOCAL_MEM_FENCE);
    for(int offset = get_local_size(0) / 2; offset > 0; offset = offset / 2) {
        if (local_index < offset) {
            float other = scratch[local_index + offset];
            float mine = scratch[local_index];
            scratch[local_index] = (mine < other) ? mine : other;
        }
        barrier(CLK_LOCAL_MEM_FENCE);
    }
    if (local_index == 0) {
        result[get_group_id(0)] = scratch[0];
    }
}

It can also be implemented well using CUDA.
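
To make the two stages in the kernel above easier to follow, here is a minimal CPU-side sketch in plain C that emulates what the work-items do. It is an illustration under stated assumptions, not the documented implementation: NUM_GROUPS, LOCAL_SIZE, stage1_min, stage2_min and two_stage_min are hypothetical names chosen for this example, and the second stage is assumed to be a plain sequential loop over the per-group results (it could equally be a second launch of the same kernel with a single work-group).

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

#define NUM_GROUPS 4  /* hypothetical number of work-groups */
#define LOCAL_SIZE 8  /* hypothetical work-group size (power of two) */

/* Stage 1: emulate the kernel for every work-group.
 * Each group writes one partial minimum to result[group]. */
static void stage1_min(const float *buffer, size_t length, float *result)
{
    const size_t global_size = (size_t)NUM_GROUPS * LOCAL_SIZE;

    /* The halving loop below only covers every slot for powers of two. */
    assert((LOCAL_SIZE & (LOCAL_SIZE - 1)) == 0);

    for (size_t group = 0; group < NUM_GROUPS; ++group) {
        float scratch[LOCAL_SIZE]; /* stands in for __local memory */

        /* Sequential part: each emulated work-item strides through the
         * input in steps of global_size and keeps its own minimum. */
        for (size_t local = 0; local < LOCAL_SIZE; ++local) {
            float accumulator = INFINITY; /* identity element for min */
            for (size_t i = group * LOCAL_SIZE + local; i < length;
                 i += global_size) {
                if (buffer[i] < accumulator)
                    accumulator = buffer[i];
            }
            scratch[local] = accumulator;
        }

        /* Tree part: halve the number of active slots each step.
         * The inner loop plays the role of the work-items between barriers;
         * reads (index >= offset) and writes (index < offset) never overlap. */
        for (size_t offset = LOCAL_SIZE / 2; offset > 0; offset /= 2) {
            for (size_t local = 0; local < offset; ++local) {
                float other = scratch[local + offset];
                if (other < scratch[local])
                    scratch[local] = other;
            }
        }
        result[group] = scratch[0];
    }
}

/* Stage 2: only NUM_GROUPS values remain, so a plain loop is enough. */
static float stage2_min(const float *partial, size_t count)
{
    float minimum = INFINITY;
    for (size_t i = 0; i < count; ++i) {
        if (partial[i] < minimum)
            minimum = partial[i];
    }
    return minimum;
}

/* Run both stages and return the minimum of buffer[0..length). */
float two_stage_min(const float *buffer, size_t length)
{
    float partial[NUM_GROUPS];
    stage1_min(buffer, length, partial);
    return stage2_min(partial, NUM_GROUPS);
}
```

Calling two_stage_min(data, n) on a float array returns its minimum, or INFINITY when n is 0. The point of the sketch is that the input length never affects the number of stages: the strided loop shrinks any input to one value per work-item, the tree shrinks that to one value per work-group, and the second stage only has to combine as many values as there are work-groups.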
