推力:填充隔离空间 [英] thrust: fill isolate space

查看:175
本文介绍了推力:填充隔离空间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个这样的数组:

0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0

0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0

我想让每个非零元素自己展开一个元素时间直到达到其他非零元素,结果如下:

I want every non-zero elements to expand themselves one element at a time until it reaches other non-zero elements, the result is like this:

1 1 1 1 1 1 5 5 5 5 3 3 3 3 8 8 8 8

1 1 1 1 1 1 5 5 5 5 3 3 3 3 8 8 8 8

有任何方法推力?

推荐答案


使用thrust有办法吗?

Is there any way to do this using thrust?

是的,这是一种可能的方法。

Yes, here is one possible approach.


  1. 位置,计算2个距离。第一个是到左方向上最近的非零值的距离,第二个是在右方向上到最近的非零值的距离。如果位置本身不为零,则左右距离都将计算为零。我们的基本引擎是分段的包含扫描,一个计算在左到右方向(计算左边的距离为每个零段),另一个计算在相反的方向(计算从右边的距离每个零段)。使用您的示例:

  1. For each position in the sequence, compute 2 distances. The first is the distance to the nearest non-zero value in the left direction, and the second is the distance to the nearest non-zero value in the right direction. If the position itself is non-zero, both left and right distances will be computed as zero. Our basic engine for this will be segmented inclusive scans, one computed in the left to right direction (to compute the distance from the left for each zero segment), and the other computed in the reverse direction (to compute the distance from the right for each zero segment). Using your example:

a vector:    0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0
a left dist: ? ? ? 0 1 2 3 4 0 1 2 0 1 2 3 0 1 2
a right dist:3 2 1 0 4 3 2 1 0 2 1 0 3 2 1 0 ? ?

请注意,在每个距离计算中,如果结束不发生,以非零值开始(因为与该方向的距离为未定义)。

Note that in each distance computation, we must special-case one end if that end does not happen to begin with a non-zero value (because the distance from that direction is "undefined"). We will special case those ? distances by assigning them large values, the reason for which will become evident in the next step.

我们现在将创建一个地图向量,对于每个输出位置,它允许我们从属于该输出位置的原始输入向量中选择一个元素。通过取两个计算的距离中的较小者并且从左边或右边调整指数的距离来计算该地图向量:

We now will create a "map" vector, which, for each output position, allows us to select an element from the original input vector that belongs in that output position. This map vector is computed by taking the lesser of the two computed distances, and adjusting the index either from the left or the right, by that distance:

output index: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
a left dist:  ? ? ? 0 1 2 3 4 0 1  2  0  1  2  3  0  1  2
a right dist: 3 2 1 0 4 3 2 1 0 2  1  0  3  2  1  0  ?  ?
map vector:   3 3 3 3 3 3 8 8 8 8 11 11 11 11 15 15 15 15


$ b b

对于地图向量计算,如果 a left dist > 一个right dist c $ c>输出索引,并向其添加右dist ,以在该位置生成映射向量元素。否则,我们取输出索引,并从中减去一个左dist 。注意,对于该计算,上面的特殊情况条目应该被认为是任意大。这在代码中通过使用大整数(1 << 30)来模拟。

For the map vector computation, if a left dist > a right dist then we take the output index and add a right dist to it, to produce the map vector element at that position. Otherwise, we take the output index and subtract a left dist from it. Note that the special-case ? entries above should be considered to be "arbitrarily large" for this computation. This is simulated in the code by using a large integer (1<<30).

一旦我们有了地图向量,它从输入到输出向量进行映射复制:

Once we have the map vector, it's a trivial matter to use it to do a mapped copy from input to output vectors:

a vector:    0 0 0 1 0 0 0 0 5 0  0  3  0  0  0  8  0  0
map vector:  3 3 3 3 3 3 8 8 8 8 11 11 11 11 15 15 15 15
out vector:  1 1 1 1 1 1 5 5 5 5  3  3  3  3  8  8  8  8


完全工作的示例:

$ cat t610.cu
#include <thrust/device_vector.h>
#include <thrust/copy.h>
#include <thrust/scan.h>
#include <thrust/iterator/permutation_iterator.h>
#include <thrust/iterator/counting_iterator.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/functional.h>
#include <thrust/transform.h>
#include <thrust/sequence.h>
#include <iostream>
#define IVAL (1<<30)

// used to create input vector for prefix sums (distance vector computation)
struct is_zero {

  template <typename T>
  __host__ __device__
  T operator() (T val) {
    return (val) ? 0:1;
    }
};

// inc and dec help with special casing of left and right ends
struct inc {

  template <typename T>
  __host__ __device__
  T operator() (T val) {
    return val+IVAL;
    }
};

struct dec {

  template <typename T>
  __host__ __device__
  T operator() (T val) {
    return val-IVAL;
    }
};

// this functor is lifted from thrust example code
// and is used to enable segmented scans based on flag delimitors
// BinaryPredicate for the head flag segment representation
// equivalent to thrust::not2(thrust::project2nd<int,int>()));
template <typename HeadFlagType>
struct head_flag_predicate : public thrust::binary_function<HeadFlagType,HeadFlagType,bool>
{
    __host__ __device__
    bool operator()(HeadFlagType left, HeadFlagType right) const
    {
        return !right;
    }
};

// distance tuple ordering is left (0), then right (1)
struct map_functor
{
  template <typename T>
  __host__ __device__
  int operator() (T dist){
    int leftdist =  thrust::get<0>(dist);
    int rightdist = thrust::get<1>(dist);
    int idx =       thrust::get<2>(dist);
    return (leftdist > rightdist) ? (idx+rightdist):(idx-leftdist);
    }
};

int main(){

  int h_a[] = { 0, 0, 0, 1, 0, 0, 0, 0, 5, 0, 0, 3, 0, 0, 0, 8, 0, 0 };
  int n = sizeof(h_a)/sizeof(h_a[0]);
  thrust::device_vector<int> a(h_a, h_a+n);
  thrust::device_vector<int> az(n);
  thrust::device_vector<int> asl(n);
  thrust::device_vector<int> asr(n);
  thrust::transform(a.begin(), a.end(), az.begin(), is_zero());
  // set up distance from the  left vector (asl)
  thrust::transform_if(az.begin(), az.begin()+1, a.begin(), az.begin(),inc(), is_zero());
  thrust::transform(a.begin(), a.begin()+1, a.begin(), inc());
  thrust::inclusive_scan_by_key(a.begin(), a.end(), az.begin(), asl.begin(), head_flag_predicate<int>());
  thrust::transform(a.begin(), a.begin()+1, a.begin(), dec());
  thrust::transform_if(az.begin(), az.begin()+1, a.begin(), az.begin(), dec(), is_zero());
  // set up distance from the right vector (asr)
  thrust::device_vector<int> ra(n);
  thrust::sequence(ra.begin(), ra.end(), n-1, -1);
  thrust::transform_if(az.end()-1, az.end(), a.end()-1, az.end()-1, inc(), is_zero());
  thrust::transform(a.end()-1, a.end(), a.end()-1, inc());
  thrust::inclusive_scan_by_key(thrust::make_permutation_iterator(a.begin(), ra.begin()), thrust::make_permutation_iterator(a.begin(), ra.end()), thrust::make_permutation_iterator(az.begin(), ra.begin()), thrust::make_permutation_iterator(asr.begin(), ra.begin()), head_flag_predicate<int>());
  thrust::transform(a.end()-1, a.end(), a.end()-1, dec());
  // create combined map vector
  thrust::device_vector<int> map(n);
  thrust::counting_iterator<int> idxbegin(0);
  thrust::transform(thrust::make_zip_iterator(thrust::make_tuple(asl.begin(), asr.begin(), idxbegin)), thrust::make_zip_iterator(thrust::make_tuple(asl.end(), asr.end(), idxbegin+n)),  map.begin(), map_functor());
  // use map to create output
  thrust::device_vector<int> result(n);
  thrust::copy(thrust::make_permutation_iterator(a.begin(), map.begin()), thrust::make_permutation_iterator(a.begin(), map.end()), result.begin());
  // display results
  std::cout << "Input vector:" << std::endl;
  thrust::copy(a.begin(), a.end(), std::ostream_iterator<int>(std::cout, " "));
  std::cout << std::endl;
  std::cout << "Output vector:" << std::endl;
  thrust::copy(result.begin(), result.end(), std::ostream_iterator<int>(std::cout, " "));
  std::cout << std::endl;
}
$ nvcc -arch=sm_20 -o t610 t610.cu
$ ./t610
Input vector:
0 0 0 1 0 0 0 0 5 0 0 3 0 0 0 8 0 0
Output vector:
1 1 1 1 1 1 5 5 5 5 3 3 3 3 8 8 8 8
$

注意:


  1. 实施可能有一些可以改进的领域,特别是在业务融合方面。然而,为了理解的目的,我认为融合使得代码有点难以阅读。

  1. The above implementation probably has areas that can be improved on, particularly with respect to fusion of operations. However, for understanding purposes, I think fusion makes the code a bit harder to read.

我只是在你给出的例子上测试它。可能有一些错误,你会发现。我的目的不是给你一个黑箱库函数,你使用但不明白,而是教你如何编写自己的代码,做你想要的。

I have really only tested it on the particular example you gave. There may be bugs that you will uncover. My purpose is not to give you a black-box library function that you use but don't understand, but rather to teach you how to write your own code that does what you want.

Jackolantern指出的歧义仍然出现在您的问题陈述中。我通过选择我的地图函子行为来模仿你指定的输出,但只是通过创建一个同样有效但相反的映射函子的实现(使用if a left dist < 一个正确的dist 然后...我可以导致3和8之间的结果采取另一个可能的结果/状态。你的评论是如果有歧义,谁到达位置首先填充它的价值到那个空间对我来说没有意义,除非你的意思是我不在乎你提供什么结果。没有特定线程的概念首先到达特定点。

The "ambiguity" pointed out by JackOLantern is still present in your problem statement. I have obscured it by choosing my map functor behavior to mimic the output you indicated as desired, but simply by creating an equally valid but opposite realization of the map functor (using "if a left dist < a right dist then ..." instead) I can cause the result between 3 and 8 to take the other possible outcome/state. Your comment that "if there is an ambiguity, whoever reaches the position first fill its value to that space" makes no sense to me, unless by that you mean "I don't care which outcome you provide." There is no concept of a particular thread reaching a particular point first. Threads (and blocks) can execute in any order, and this order can change from device to device, and run to run.

这篇关于推力:填充隔离空间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆