parallel "Folding" in Haskell

Problem Description
I have a function with the type below:

```haskell
union :: a -> a -> a
```

And `a` has an additivity property, so we can regard `union` as a version of `(+)`.

Say we have `[a]` and want to perform a parallel "folding". For non-parallel folding we can only do:

```haskell
foldl1' union [a]
```
But how to perform it in parallel?

I can demonstrate the problem with `Num` values and the `(+)` function. For example, we have the list `[1,2,3,4,5,6]` and `(+)`. In parallel we should split:

```
[1,2,3] (+) [4,5,6]
[1,2] (+) [3] (+) [4,5] (+) [6]
([1] (+) [2]) (+) ([3] (+) [4]) (+) ([5] (+) [6])
```

Then we want to perform each `(+)` operation in parallel, and combine the results into the answer: `[3] (+) [7] (+) [11] = 21`.

Note that we may split the list, or perform the operations in any order, because of the additivity of `a`.

Is there any way to do that using a standard library?
Solution

You need to generalize your `union` to any associative binary operator ⊕ such that `(a ⊕ b) ⊕ c == a ⊕ (b ⊕ c)`. If at the same time you even have a unit element that is neutral with respect to ⊕, you have a monoid.

The important aspect of associativity is that you can arbitrarily group chunks of consecutive elements in a list, and ⊕ them in any order, since `a ⊕ (b ⊕ (c ⊕ d)) == (a ⊕ b) ⊕ (c ⊕ d)`: each bracket can be computed in parallel; then you'd need to "reduce" the "sums" of all brackets, and you've got your map-reduce sorted.
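This regrouping can be checked directly with a base-only sketch: reduce each chunk, then reduce the partial results. Associativity guarantees both groupings give the same answer. (The `chunksOf` helper is defined locally here for illustration; `Data.List.Split` also provides one, but it is not in base.)

```haskell
import Data.List (foldl1')

-- Split a list into consecutive chunks of (at most) n elements.
chunksOf :: Int -> [a] -> [[a]]
chunksOf _ [] = []
chunksOf n xs = let (h, t) = splitAt n xs in h : chunksOf n t

-- Map-reduce style fold: reduce each chunk, then reduce the
-- partial results. Associativity of the operator guarantees the
-- same answer as a single sequential fold.
chunkedFold :: (a -> a -> a) -> Int -> [a] -> a
chunkedFold f n = foldl1' f . map (foldl1' f) . chunksOf n

main :: IO ()
main = do
  print (chunkedFold (+) 2 [1,2,3,4,5,6 :: Int])  -- chunks [3,7,11], total 21
  print (foldl1' (+) [1,2,3,4,5,6 :: Int])        -- same total, 21
```

Each of the per-chunk reductions is exactly the work you would hand to a separate thread; the outer `foldl1'` is the final "reduce" step.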
For this parallelisation to make sense, you need the chunking operation to be faster than ⊕; otherwise, doing ⊕ sequentially is better than chunking. One such case is when you have a random-access "list", say, an array. Data.Array.Repa has plenty of parallelized folding functions.
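As a minimal sketch of what that looks like, assuming the `repa` package is installed: `foldAllP` reduces a whole array in parallel, taking the associative operator together with its unit element.

```haskell
import qualified Data.Array.Repa as R
import Data.Array.Repa (Z (..), (:.) (..))

main :: IO ()
main = do
  -- A one-dimensional unboxed array of six elements.
  let arr = R.fromListUnboxed (Z :. (6 :: Int)) [1,2,3,4,5,6 :: Int]
  -- foldAllP reduces the array in parallel; the operator must be
  -- associative and 0 must be its unit for the result to be deterministic.
  total <- R.foldAllP (+) 0 arr
  print total
```

Because Repa arrays are random-access, splitting the work into chunks is O(1), which is exactly the situation where chunked parallel folding pays off.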
If you are thinking of practising by implementing one yourself, you need to pick a good, complex function ⊕ so that the benefit will show. For example:
```haskell
import Control.Parallel
import Data.List

pfold :: (Num a, Enum a) => (a -> a -> a) -> [a] -> a
pfold _ [x] = x
pfold mappend xs = (ys `par` zs) `pseq` (ys `mappend` zs)
  where
    len = length xs
    (ys', zs') = splitAt (len `div` 2) xs
    ys = pfold mappend ys'
    zs = pfold mappend zs'

main = print $ pfold (+) [ foldl' (*) 1 [1..x] | x <- [1..5000] ]
-- need a more complicated computation than (+) of numbers
-- so we produce a list of products of many numbers
```
Here I deliberately use an associative operation, which is called `mappend` only locally, to show it can work for a weaker notion than a monoid: only associativity matters for parallelism; since parallelism makes sense only for non-empty lists anyway, there is no need for `mempty`.

```
ghc -O2 -threaded a.hs
a +RTS -N1 -s
```
gives 8.78 seconds total run time, whereas

```
a +RTS -N2 -s
```

gives 5.89 seconds total run time on my dual-core laptop. Obviously, there is no point in trying more than -N2 on this machine.
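The recursive splitting in `pfold` can also be expressed with the Strategies layer of the same `parallel` package: chunk the list, reduce each chunk in parallel, then combine the partial results sequentially. A sketch, with arbitrary helper names and chunk size:

```haskell
import Control.Parallel.Strategies (parList, rseq, using)
import Data.List (foldl1')

-- Split into fixed-size chunks, reduce each chunk in parallel
-- via the parList strategy, then combine the partial results.
pfoldChunks :: Int -> (a -> a -> a) -> [a] -> a
pfoldChunks n f xs =
  foldl1' f (map (foldl1' f) (chunksOf n xs) `using` parList rseq)
  where
    chunksOf _ [] = []
    chunksOf k ys = let (h, t) = splitAt k ys in h : chunksOf k t

main :: IO ()
main = print (pfoldChunks 500 (+) [1 .. 5000 :: Integer])  -- 12502500
```

Note that `rseq` only forces each partial result to weak head normal form, which fully evaluates an `Integer` sum; for lazier result types you would want `rdeepseq` from the same module instead.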