How do I write a parallel reduction using strategies in Haskell?
Problem Description
In high-performance computing, sums, products, etc are often calculated using a "parallel reduction" that takes n elements and completes in O(log n) time (given enough parallelism). In Haskell, we usually use a fold for this kind of calculation, but evaluation time is always linear in the length of the list.
Data Parallel Haskell has some of this built in, but what about in the common framework of a list? Can we do it with Control.Parallel.Strategies
?
So, assuming f
is associative, how do we write
parFold :: (a -> a -> a) -> [a] -> a
so that parFold f xs
only needs time logarithmic in length xs
?
Solution

I don't think a list is the right data type for this. Because it's just a linked list, the data will necessarily be accessed sequentially. Although you can evaluate the items in parallel, you won't gain much in the reduction step. If you really need a list, I think the best function would be just
parFold f = foldl1' f . withStrategy (parList rseq)
or maybe
parFold f = foldl1' f . withStrategy (parBuffer 5 rseq)
If the reduction step is complex, you might get a gain by subdividing the list like this:
parReduce f = foldl' f mempty . reducedList . chunkList . withStrategy (parList rseq)
where
chunkList [] = []
chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
reducedList = parMap rseq (foldl' f mempty)
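As a base-only illustration of the chunk-then-reduce pipeline, here is a complete sequential sketch (the names `chunkList` and `chunkedSum` are mine; with the `parallel` package available, the inner `map` would become `parMap rseq`):

```haskell
import Data.List (foldl')
import Data.Monoid (Sum(..), getSum)

-- Split a list into chunks of n elements. Note the [] base case,
-- without which the recursion would never terminate.
chunkList :: Int -> [a] -> [[a]]
chunkList _ [] = []
chunkList n xs = let (l, ls) = splitAt n xs in l : chunkList n ls

-- Sequential sketch of the chunked reduction, using the Sum monoid.
-- Assumption: with the `parallel` package, the inner `map` below
-- would be replaced by `parMap rseq` to reduce chunks in parallel.
chunkedSum :: [Int] -> Int
chunkedSum = getSum
           . foldl' (<>) mempty        -- combine per-chunk results
           . map (foldl' (<>) mempty)  -- reduce each chunk
           . chunkList 1000
           . map Sum
```

For example, `chunkedSum [1..5000]` reduces five chunks of 1000 elements each and combines their partial sums.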
I've taken the liberty of assuming your data is a Monoid so that mempty exists; if that isn't possible, you can either replace mempty with your own empty value or, in the worst case, use foldl1'.
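If the element type has no Monoid instance, the foldl1' fallback can be sketched like this (`reduceChunks` is a name I've introduced; it assumes every chunk is non-empty, which the chunking step guarantees for non-empty input):

```haskell
import Data.List (foldl1')

-- Fallback reduction when there is no Monoid instance and hence no
-- mempty to use as an identity. Assumption: each chunk is non-empty,
-- otherwise foldl1' throws on the empty list.
reduceChunks :: (a -> a -> a) -> [[a]] -> a
reduceChunks f = foldl1' f . map (foldl1' f)
```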
There are two combinators from Control.Parallel.Strategies
in use here. parList
evaluates all items of the list in parallel. After that, chunkList
divides the list into chunks of 1000 elements, and each of those chunks is then reduced in parallel by parMap
.
You might also try
parReduce2 f = foldl' f mempty . reducedList . chunkList
where
chunkList [] = []
chunkList list = let (l,ls) = splitAt 1000 list in l : chunkList ls
reducedList = parMap rseq (foldl' f mempty)
Depending on exactly how the work is distributed, one of these may be more efficient than the others.
If you can use a data structure that has good support for indexing though (Array, Vector, Map, etc.), then you can do binary subdivisions for the reduction step, which will probably be better overall.
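The binary-subdivision idea can be sketched as follows, written here over a plain list with explicit halving so it stays base-only; on an indexed structure the split would be O(1), and each half could be sparked with `par` from the `parallel` package (assumption: `treeReduce` is my name for this sketch, and `f` must be associative):

```haskell
-- Tree-shaped reduction: split in half, reduce each half, combine.
-- The recursion depth is O(log n), so with both halves evaluated in
-- parallel the reduction completes in logarithmic time (assumption:
-- sparking via Control.Parallel is omitted to keep this base-only).
treeReduce :: (a -> a -> a) -> [a] -> a
treeReduce _ []  = error "treeReduce: empty list"
treeReduce _ [x] = x
treeReduce f xs  = f (treeReduce f left) (treeReduce f right)
  where
    (left, right) = splitAt (length xs `div` 2) xs
```

Note that `treeReduce` only gives the same answer as a left fold when `f` is associative, which is exactly the assumption the question makes.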