如何用Haskell向量编写并行代码? [英] How to write parallel code with Haskell vectors?
问题描述
一方面,在Haskell Vector a
似乎是用作数字数组的首选类型。甚至还有一个(不完整的) Vector Tutorial 。
另一方面, Control.Parallel.Strategies
主要以 Traversable
的形式定义。向量库不提供这些实例。
Traversable t
的最小完整定义也应定义 Foldable
和
遍历:: Applicative f => (a - > f b) - > t a - > f(t b)
sequenceA :: Applicative f => t(f a) - > f(ta)
我没有看到 sequenceA $ c可以为
Data.Vector.Unboxed.Vector
定义$ c>。那么,用无盒装矢量编写并行代码的最佳方法是什么?定义一些新的临时策略,如 evalVector
或使用 par
和 pseq
显式地或使用普通的 Data.Array
而不是向量?
PS Plain Array
s可以并行处理,不会出现任何问题: https:// gist。 github.com/701888
对于 parVector
但这对我有用:
将限定的Data.Vector导入为V
import Control.Parallel.Strategies
import Control.Parallel
import Control.DeepSeq
ack :: Int - > Int - > Int
ack 0 n = n + 1
ack m 0 = ack(m-1)1
ack mn = ack(m-1)(ack m(n-1))
main = do
let vec = V.enumFromN 1 1000
let res =(V.map(ack 2)vec)`使用`parVector
print res
parVector :: NFData a =>策略(V.Vector a)
parVector vec = eval vec`seq`完成vec
其中
chunkSize = 1
eval v
| vLen == 0 =()
| vLen <= chunkSize = rnf(v.V0!0) - 修改此以处理组块> 1
|否则= eval(V. take half v)`par` eval(V.drop half v)
其中vLen = V.length v
half = vLen`div` 2
$
[tommd @ Mavlo Test] $ ghc --make -O2 -threaded t.hs
...哑警告...
[tommd @ Mavlo Test] $ time ./t + RTS -N1 > / dev / null
real 0m1.962s user 0m1.951s sys 0m0.009s
[tommd @ Mavlo Test] $ time ./t + RTS -N2> / dev / null
real 0m1.119s user 0m2.221s sys 0m0.005s
当我用 $ Integer
而不是 Int
在类型签名中:
[tommd @ Mavlo Test] $ time ./t + RTS -N2> / dev / null
real 0m4.754s
user 0m9.435s
sys 0m0.028s
[tommd @ Mavlo Test] $ time ./t + RTS -N1> / dev / null
real 0m9.008s
user 0m8。 952s
sys 0m0.029s
编辑:和一个解决方案这更接近你以前的尝试更清洁(它不使用三个独立模块的功能)并且效果很好: 从这个解决方案中学习的东西: 性能与第一个 One one hand, in Haskell On the other hand, The minimal complete definition of I don't see how P.S. Plain It's a hack job for And running this code: When I run the code with Rock! EDIT: And a solution that is a bit closer to your earlier attempt is cleaner (it doesn't use functions from three separate modules) and works great: Things to learn from this solution: Performance seems similar to the first 这篇关于如何用Haskell向量编写并行代码?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
parVector :: NFData a =>策略(V.Vector a)
parVector vec =
让vLen = V.length vec
half = vLen`div` 2
minChunk = 10
in if vLen > minChunk
然后做
let v1 = V.unsafeSlice 0 half vec
v2 = V.unsafeSlice half(vLen - half)vec
parVector v1
parVector v2
return vec
else
evalChunk(vLen-1)>>
return vec
其中
evalChunk 0 = rpar(rdeepseq(vec V.!0))>> return vec
evalChunk i = rpar(rdeepseq(vec V.!i))>> evalChunk(i-1)
Eval
monad,这是严格的,所以我们一定会触发一切(相比于包装 let
并记住使用爆炸模式)。
evalChunk
使用 rpar
和 rdeepseq
(我不相信
rpar vec
会强制任何向量元素)。
切片
取开始索引和长度,而不是开始和结束索引。哎呀!
Control.DeepSeq(NFData)
,但我已经通过电子邮件发送库列表以尝试并修复此问题。
parVector
解决方案类似这个答案,所以我不会发布数字。Vector a
seems to be the preferred type to use as an array of numbers. There is even an (incomplete) Vector Tutorial.Control.Parallel.Strategies
are defined mostly in terms of Traversable
. Vector library doesn't provide these instances.Traversable t
should also define Foldable
andtraverse :: Applicative f => (a -> f b) -> t a -> f (t b)
sequenceA :: Applicative f => t (f a) -> f (t a)
sequenceA
can be defined for Data.Vector.Unboxed.Vector
. So, what is the best approach to writing parallel code with unboxed vectors? Defining some new ad hoc strategies like evalVector
or using par
and pseq
explicitly or using plain Data.Array
instead of vectors?Array
s are parallelizable without problems: https://gist.github.com/701888parVector
but this worked for me:import qualified Data.Vector as V
import Control.Parallel.Strategies
import Control.Parallel
import Control.DeepSeq
ack :: Int -> Int -> Int
ack 0 n = n+1
ack m 0 = ack (m-1) 1
ack m n = ack (m-1) (ack m (n-1))
main = do
let vec = V.enumFromN 1 1000
let res = (V.map (ack 2) vec) `using` parVector
print res
parVector :: NFData a => Strategy (V.Vector a)
parVector vec = eval vec `seq` Done vec
where
chunkSize = 1
eval v
| vLen == 0 = ()
| vLen <= chunkSize = rnf (v V.! 0) -- FIX this to handle chunks > 1
| otherwise = eval (V.take half v) `par` eval (V.drop half v)
where vLen = V.length v
half = vLen `div` 2
[tommd@Mavlo Test]$ ghc --make -O2 -threaded t.hs
... dumb warning ...
[tommd@Mavlo Test]$ time ./t +RTS -N1 >/dev/null
real 0m1.962s user 0m1.951s sys 0m0.009s
[tommd@Mavlo Test]$ time ./t +RTS -N2 >/dev/null
real 0m1.119s user 0m2.221s sys 0m0.005s
Integer
instead of Int
in the type signature:[tommd@Mavlo Test]$ time ./t +RTS -N2 >/dev/null
real 0m4.754s
user 0m9.435s
sys 0m0.028s
[tommd@Mavlo Test]$ time ./t +RTS -N1 >/dev/null
real 0m9.008s
user 0m8.952s
sys 0m0.029s
parVector :: NFData a => Strategy (V.Vector a)
parVector vec =
let vLen = V.length vec
half = vLen `div` 2
minChunk = 10
in if vLen > minChunk
then do
let v1 = V.unsafeSlice 0 half vec
v2 = V.unsafeSlice half (vLen - half) vec
parVector v1
parVector v2
return vec
else
evalChunk (vLen-1) >>
return vec
where
evalChunk 0 = rpar (rdeepseq (vec V.! 0)) >> return vec
evalChunk i = rpar (rdeepseq (vec V.! i)) >> evalChunk (i-1)
Eval
monad, which is strict so we're sure to spark everything (compared to wrapping things in let
and remembering to use bang patterns).evalChunk
forces evaluation of each element using rpar
and rdeepseq
(I don't believe rpar vec
forces any of the vector's elements).slice
takes a start index and length, not a start and end index. Oops!Control.DeepSeq (NFData)
, but I've e-mailed the libraries list to try and fix that issue.parVector
solution in this answer, so I won't post numbers.