如何采取一个数组片断与Repa在一定范围内 [英] How to take an array slice with Repa over a range

查看：216 发布时间：2016/5/30 23:48:57 arrays haskell repa

本文介绍了如何采取一个数组片断与Repa在一定范围内的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我试图推行以计算积分图像使用Repa的累加值功能。我的当前实现如下所示：

I am attempting to implement a cumulative sum function using Repa in order to calculate integral images. My current implementation looks like the following:

cumsum :: (Elt a, Num a) => Array DIM2 a -> Array DIM2 a
cumsum array = traverse array id cumsum' 
    where
        elementSlice inner outer = slice array (inner :. (0 :: Int))
        cumsum' f (inner :. outer) = Repa.sumAll $ elementSlice inner outer

的问题是在elementSlice功能。在MATLAB或说numpy的，这可能被指定为数组[内，0：外部]。所以，我在找的是沿着线的东西：

The problem is in the elementSlice function. In matlab or say numpy this could be specified as array[inner,0:outer]. So what I am looking for is something along the lines of:

slice array (inner :. (Range 0 outer))

然而，似乎切片不允许在Repa当前指定以上范围。我考虑过使用分区，在高效的并行模板卷积在Haskell的讨论，但是这似乎是一个相当重量级的方法，如果他们将改变每个迭代。我也算是屏蔽片（由二元矢量相乘） - 但这又似乎像它会在大型矩阵非常糟糕，因为执行我会在矩阵中的每个点......

However, it seems that slices are not allowed to be specified over ranges currently in Repa. I considered using partitions as discussed in Efficient Parallel Stencil Convolution in Haskell, but this seems like a rather heavyweight approach if they would be changed every iteration. I also considered masking the slice (multiplying by a binary vector) - but this again seemed like it would perform very poorly on large matrices since I would be allocating a mask vector for every point in the matrix...

我的问题 - 没有人知道是否有计划增加对一系列切片到Repa支持？还是有高性能的方法可以让我已经与不同的方法攻击这个问题，也许？

My question - does anyone know if there are plans to add support for slicing over a range to Repa? Or is there a performant way I can already attack this problem, maybe with a different approach?

推荐答案

其实，我认为主要的问题是，Repa没有扫描原始。（但是，非常相似的文库加速的一样。）有扫描的两个变体，preFIX扫描和后缀扫描。由于一维数组

Actually, I think the main problem is that Repa does not have a scan primitive. (However, a very similar library Accelerate does.) There are two variants of scan, prefix scan and suffix scan. Given a 1D array

[A_1，...，A_N]

preFIX扫描返回

prefix scan returns

[0，A_0，A_0 + A_1，...，A_0 + ... + A_ {N-1}]

而后缀扫描产生

[A_0，A_0 + A_1，...，A_0 + A_1 + ... + A_N]

我想这是你要的东西与你的累计总和（ cumsum ）的功能。

I assume this is what you are going for with your cumulative sum (cumsum) function.

preFIX和后缀扫描推广很自然地到多维数组并基于简树的高效实现。就这个话题比较旧纸是扫描基元向量计算机。此外，Conal埃利奥特最近写几个的博客的的帖子在哈斯克尔推导高效的并行扫描。

Prefix and suffix scans generalise quite naturally to multi-dimensional arrays and have an efficient implementation based on tree reduction. A relatively old paper on the topic is "Scan Primitives for Vector Computers". Also, Conal Elliott has recently written several blog posts on deriving efficient parallel scans in Haskell.

的积分图像（2D阵列上）可以通过执行两次扫描一水平和一个垂直地进行计算。在没有扫描原始的我已经实现的，效率很低。

The integral image (on a 2D array) can be calculated by doing two scans, one horizontally and one vertically. In the absence of a scan primitive I have implemented one, very inefficiently.

horizScan :: Array DIM2 Int -> Array DIM2 Int
horizScan arr = foldl addIt arr [0 .. n - 1]
  where 
    addIt :: Array DIM2 Int -> Int -> Array DIM2 Int
    addIt accum i = accum +^ vs
       where 
         vs = toAdd (i+1) n (slice arr (Z:.All:.i))
    (Z:.m:.n) = arrayExtent arr

--
-- Given an @i@ and a length @len@ and a 1D array @arr@ 
-- (of length n) produces a 2D array of dimensions n X len.
-- The columns are filled with zeroes up to row @i@.
-- Subsequently they are filled with columns equal to the 
-- values in @arr.
--
toAdd :: Int -> Int -> Array DIM1 Int -> Array DIM2 Int
toAdd i len arr = traverse arr (\sh -> sh:.(len::Int)) 
               (\_ (Z:.n:.m) -> if m >= i then arr ! (Z:.n) else 0)

，用于计算积分图像的功能可以被定义为

The function for calculating the integral image can then be defined as

vertScan :: Array DIM2 Int -> Array DIM2 Int
vertScan = transpose . horizScan . transpose

integralImage = horizScan . vertScan

由于扫描已实施加速，它不应该太硬将其添加到Repa。我不知道，如果使用现有的Repa原语的高效实现是可能或没有。

Given that scan has been implemented for Accelerate, it shouldn't be too hard to add it to Repa. I am not sure if an efficient implementation using the existing Repa primitives is possible or not.

这篇关于如何采取一个数组片断与Repa在一定范围内的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

如何采取一个数组片断与Repa在一定范围内 [英] How to take an array slice with Repa over a range

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

如何采取一个数组片断与Repa在一定范围内 [英] How to take an array slice with Repa over a range

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭