从阵列中平行移除元素 [英] parallel removal of elements from an array

查看:92
本文介绍了从阵列中平行移除元素的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想尝试并行版本的 STL remove_if 。我做的是在全局内存中创建一个计数器,并让每个线程在一个元素上工作。如果该元素不等于键,那么它将被复制到结果数组中,索引由计数器通过原子添加确定。有没有更好的选择,以避免频繁的原子操作?

I'm trying to do a parallel version of STL remove_if. What I did is to create a counter in the global memory, and let each thread work on one element. If that element is not equal to the key, then it will be copied to the result array, with an index determined by the counter through an atomic add. Is there any better alternative to avoid frequent atomic operations?

我发现推力库也有一个remove_if,但我感到非常困惑的源代码位于thrust\detail\backend\cpp\remove.h目录:

I've found that the thrust library also has a remove_if, but I feel very confused about the source code located in "thrust\detail\backend\cpp\remove.h" directory:

template<typename ForwardIterator,
     typename InputIterator,
     typename Predicate>
ForwardIterator remove_if(ForwardIterator first,
                        ForwardIterator last,
                        InputIterator stencil,
                        Predicate pred)
{
// advance iterators until pred(*stencil) is true or we reach the end of input
while(first != last && !bool(pred(*stencil)))
{
    ++first;
    ++stencil;
}

if(first == last)
    return first;

// result always trails first 
ForwardIterator result = first;

++first;
++stencil;

while(first != last)
{
    if(!bool(pred(*stencil)))
    {
        *result = *first;
        ++result;
    }
    ++first;
    ++stencil;
}

return result;
}

这不是按顺序执行元素移除吗?

Isn't this performing the element removal in sequence?

感谢任何建议!

推荐答案

除非你有一个令人信服的理由实现,我建议你只使用Thrust remove_if()。推力是建立在STL上的,如果你对通用性的要求是相似的,你将会编写看起来非常类似于Thrust源代码的代码。

Unless you have a compelling reason to roll your own implementation, I recommend you just use Thrust remove_if(). Thrust is modeled on the STL and if your requirements for generality are similar, you will wind up writing code that looks very similar to the Thrust source code.

如果性能推力不令人满意,推力社区(包括主要作者)可能对如何制定你的代码以提高性能提出了很好的建议。

If the performance of Thrust isn't satisfactory, the Thrust community (including the principal authors) might have good suggestions on how to formulate your code for better performance.

如果你有垂直应用程序和推力不够快 - 滚动基于扫描的实现作为最后手段。算法的一行概要是对谓词的逆执行并行前缀和(scan) - 要保留的元素的输出索引由扫描的相应元素指定。

Failing that - if you have a vertical application and Thrust isn't fast enough - roll a scan-based implementation as a last resort. The one-line summary of the algorithm is to do a parallel prefix sum ("scan") on the inverse of the predicate - the output index of elements you want to keep is then specified by the corresponding element of the scan.

这篇关于从阵列中平行移除元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆