平行“折叠"在哈斯克尔 [英] parallel "Folding" in Haskell
问题描述
我有一个类型如下的函数:
I have a function with type below:
union :: a -> a -> a
并且 a
具有 可加性 属性.所以我们可以把union
看成是(+)
And a
has additivity property. So we can regard union
as a version of (+)
比如说,我们有[a]
,想要进行一个并行的"folding"
,对于非并行的折叠我们只能做:
Say, we have [a]
, and want to perform a parallel "folding"
, for non-parallel foldling we can do only:
foldl1' union [a]
但是如何并行执行呢?我可以演示 Num
值和 (+)
函数的问题.
But how to perform it in parallel?
I can demonstrate problem on Num
values and (+)
function.
例如,我们有一个列表 [1,2,3,4,5,6]
和 (+)
同时我们应该拆分
For example, we have a list [1,2,3,4,5,6]
and (+)
In parallel we should split
[1,2,3] (+) [4,5,6]
[1,2] (+) [3] (+) [4,5] (+) [6]
([1] (+) [2]) (+) ([3] (+) [4]) (+) ([5] (+) [6])
然后每个(+)
我们要并行执行的操作,并结合起来回答
then each (+)
operation we want to perform in parallel, and combine to answer
[3] (+) [7] (+) [11] = 21
请注意,由于a
可加性,我们拆分列表或以任何顺序执行操作.
Note, that we split list, or perform operations in any order, because of a
additivity.
有没有办法使用任何标准库来做到这一点?
Is there any ways to do that using any standard library?
推荐答案
您需要将 union
推广到任何关联二元运算符 ⊕使得 (a ⊕ b) ⊕c == a ⊕(b ⊕ c).如果同时你甚至有一个相对于 ⊕ 中性的单位元素,你就有了幺半群.
You need to generalize your union
to any associative binary operator ⊕ such that (a ⊕ b) ⊕ c == a ⊕ (b ⊕ c). If at the same time you even have a unit element that is neutral with respect to ⊕, you have a monoid.
关联性的一个重要方面是您可以将列表中的连续元素块任意分组,并且 ⊕它们以任何顺序排列,因为 ⊕(b ⊕ (c ⊕ d)) == (a ⊕ b) ⊕(c ⊕ d) - 每个括号可以并行计算;然后你需要减少"所有括号的总和",并且你已经对你的 map-reduce 进行了排序.
The important aspect of associativity is that you can arbitrarily group chunks of consecutive elements in a list, and ⊕ them in any order, since a ⊕ (b ⊕ (c ⊕ d)) == (a ⊕ b) ⊕ (c ⊕ d) - each bracket can be computed in parallel; then you'd need to "reduce" the "sums" of all brackets, and you've got your map-reduce sorted.
为了使这种并行化有意义,您需要比 ⊕ 更快的分块操作.- 否则,做 ⊕顺序优于分块.一种这样的情况是当您有一个随机访问列表" - 例如一个数组.Data.Array.Repa 具有大量并行折叠功能.
In order for this parallellisation to make sense, you need the chunking operation to be faster than ⊕ - otherwise, doing ⊕ sequentially is better than chunking. One such case is when you have a random access "list" - say, an array. Data.Array.Repa has plenty of parallellized folding functions.
如果你想练习自己实现一个,你需要选择一个好的复杂函数⊕如此好处就会显现出来.
If you are thinking of practicising to implement one yourself, you need to pick a good complex function ⊕ such that the benefit will show.
例如:
import Control.Parallel
import Data.List
pfold :: (Num a, Enum a) => (a -> a -> a) -> [a] -> a
pfold _ [x] = x
pfold mappend xs = (ys `par` zs) `pseq` (ys `mappend` zs) where
len = length xs
(ys', zs') = splitAt (len `div` 2) xs
ys = pfold mappend ys'
zs = pfold mappend zs'
main = print $ pfold (+) [ foldl' (*) 1 [1..x] | x <- [1..5000] ]
-- need a more complicated computation than (+) of numbers
-- so we produce a list of products of many numbers
这里我特意使用了一个仅在本地称为 mappend
的关联操作,以表明它可以用于比幺半群更弱的概念——只有关联性对并行性很重要;因为并行只对非空列表有意义,所以不需要 mempty
.
Here I deliberately use a associative operation, which is called mappend
only locally, to show it can work for a weaker notion than a monoid - only associativity matters for parallelism; since parallelism makes sense only for non-empty lists anyway, no need for mempty
.
ghc -O2 -threaded a.hs
a +RTS -N1 -s
总运行时间为 8.78 秒,而
Gives 8.78 seconds total run time, whereas
a +RTS -N2 -s
在我的双核笔记本电脑上的总运行时间为 5.89 秒.显然,在这台机器上尝试超过 -N2 是没有意义的.
Gives 5.89 seconds total run time on my dual core laptop. Obviously, no point trying more than -N2 on this machine.
这篇关于平行“折叠"在哈斯克尔的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!