列表,阵列和可变阵列之间的Eratosthenes筛选的理想实现是什么? [英] What's the ideal implementation for the Sieve of Eratosthenes between Lists, Arrays, and Mutable Arrays?
问题描述
现在我的问题是,应该在哪种情况下使用哪一种?
更正我的初始推理也是有帮助的: 我假设列表1是最习惯且容易阅读的为一个Haskeller。虽然这是正确的吗?我想知道它是否会遇到与另一个基于列表的筛选相同的问题,然后我学到了实际上并未实现该算法:
$ b $ em(编辑:此处显示的是基于列表的筛选器I知道有问题,而不是来自RosettaCode的,我将其粘贴在底部)
primes = sieve [2 ..]其中
sieve(p:x)= p:sieve [n | n <-x,n`mod` p> 0]
在性能方面,不可变数组似乎是赢家。上限 m
2000000
,时间约为:
- 1.3s为List
- 0.6s为数组
所以我会选择Array来获得性能。
当然,可变数组一个也很容易推理,因为我有一个更强制性的语言背景。我不确定为什么我会选择这个,如果我在Haskell编码的话,因为它比别人慢,而且非惯用。
复制代码此处仅供参考:
列表:
primesTo m = 2:eratos [3,5..m]其中
eratos(p:xs)| p * p> m = p:xs
| True = p:eratos(xs`minus` [p * p,p * p + 2 * p ..])
减a @(x:xs)b @(y:ys)=案例比较xy
LT - > x:减xs b
EQ - >减去xs ys
GT - >减去ys
减去ab = a
不可变数组:
import Data.Array.Unboxed
primesToA m = sieve 3(array(3,m) [(i,odd i)| i< - [3..m]] :: UArray Int Bool)
where
sieve pa
| p * p> m = 2:[i | (i,True)< - 相关a]
| a!p = sieve(p + 2)$ a // [(i,False)| i< - [p * p,p * p + 2 * p..m]]
|否则=筛分(p + 2)a
可变数组: p>
import Control.Monad(forM_,when)
import Control.Monad.ST
导入Data.Array。 ST
导入Data.Array.Unboxed
primeSieve :: Integer - > UArray Integer Bool
primeSieve top = runSTUArray $ do
a< - newArray(2,top)True - :: ST s(STUArray s Integer Bool)
let r = ceiling。 sqrt $ fromInteger top
forM_ [2..r] $ \i - > do
ai< - readArray a i
当ai $ do
forM_ [i * i,i * i + i..top] $ \j - > do
writeArray a j False
返回a
- 从列表中返回素数列表:
primesTo :: Integer - > [Integer]
primesTo top = [p | (p,True)< - assocs $ primeSieve top]
编辑
- 我把特纳的筛子展示在顶部,但这不是我与其他两个比较的基于列表的示例。我只是想知道基于列表的示例是否与Turner's有相同的不正确的Eratosthenes筛问题。
- 看起来性能比较是不公平的,因为可变数组示例code> Integer 而不是
Int
...
primes =>
解决方案筛子[2 ..]其中
筛子(p:x)= p:sieve [n | n <-x,n`mod` p> 0]
不是筛子。这是非常低效的审判分裂。 不要使用!
我很好奇你是如何得到你的时间的,Turner筛没有办法只需几秒钟即可产生不超过200万的素数。让它找到20万的质数,就得到了
$ bMUT时间6.38s(已过6.39s)
GC时间9.19s(已过9.20s)
EXIT时间0.00s(已过去0.00s)
总时间15.57s(已过去15.59s)
在我的机器上(64位Linux,ghc-7.6.1,用-O2编译)。该算法的复杂性是
O(N²/log²N)
,几乎是二次的。让它进行到2,000,000将需要大约二十分钟。
阵列版本的时间也是可疑的,尽管在另一个方向。你测量了解码代码吗?
筛选到2,000,000,使用优化编译,可变数组代码运行0.35秒,而不可变数组代码运行0.12秒。
现在,它的可变数组仍然是不可变数组的三倍左右。
但是,这是一个不公平的比较。对于不可变数组,您使用
Int
s和可变数组Integer
s。更改可变数组代码以使用Int
s - 因为它应该是,因为在底层,数组是Int
-indexed ,所以使用Integer
是一个不必购买任何东西的性能牺牲品 - 让可变数组代码在0.15秒内运行。接近可变数组代码,但不是那里。然而,你让可变数组做更多的工作,因为在不可变数组代码中,你只能消除奇素数的奇数倍,但在可变数组代码中,你标记所有素数的所有倍数。改变可变数组代码以特别处理2,并且仅消除奇数素数的奇数倍也将其降低到0.12秒。
但是,您正在使用范围检查数组索引,这是很慢的,并且由于索引的有效性在代码本身中被检查,所以不必要。将其改为使用
unsafeRead
和unsafeWrite
可将不可变数组的时间降低到0.09秒。
然后你有这样的问题:使用
forM_ [x,y .. z]
使用方格
Int
s(幸运的是,GHC消除了列表)。自己编写一个循环,以便只使用未装箱的Int#
s,时间将减少到0.02秒。
<$
import Control.Monad(forM_,when)
import Control.Monad.ST
import Data.Array。pre ${ - #LANGUAGE MonoLocalBinds# - }
import Data.Array。 ST
导入Data.Array.Unboxed
导入Data.Array.Base
primeSieve :: Int - > UArray Int Bool
primeSieve top = runSTUArray $ do
a < - newArray(0,top)True
unsafeWrite a 0 False
unsafeWrite a 1 False
let r =天花板。 sqrt $ fromIntegral top
标记步骤idx
|顶部< idx = return()
|否则= do
unsafeWrite a idx False
标记步骤(idx + step)
筛选p
| r< p =返回
| (2 * p)(p * p)
筛选(p + 2)
标记2 4 $ b(否则= do
prim < - unsafeRead ap
$ b sift 3
- 从列表中返回素数列表:
primesTo :: Int - > [Int]
primesTo top = [p | (p,True)< - assocs $ primeSieve top]
main :: IO()
main = print.last $ primesTo 2000000
因此,对于Eratosthenes的Sieve来说,你应该使用一个数组下一个短暂的时间。
你使用不可变数组获得了非常简单和直接的代码,并且代码的体积不会太高(对于更高的代码,限制,因为数组仍然被复制和垃圾收集,但这并不算太坏)。
当你需要更好的性能时,你需要可变数组。编写高效的可变数组代码并不是完全微不足道的,必须知道编译器如何翻译各种结构以选择正确的结构,有些人会认为这种代码是单一的。但是你也可以使用一个库(免责声明:我写它),它提供了一个相当有效的实现,而不是自己编写它。
In Haskell, I've found three simple implementations of the Sieve of Eratosthenes on the Rosetta Code page.
Now my question is, which one should be used in which situations?
Correcting my initial reasoning would be helpful too:
I'm assuming the List one is the most idiomatic and easy to read for a Haskeller. Is it correct, though? I'm wondering if it suffers from the same problems as another list-based sieve that I then learned was not actually implementing the algorithm:
(edit: shown here is the list-based sieve I know has problems, not the one from RosettaCode, which I pasted at the bottom)primes = sieve [2..] where sieve (p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]
In terms of performance, the immutable Array seems to be the winner. With an upper bound
m
of2000000
, the times were about:- 1.3s for List
- 0.6s for Array
- 1.8s for Mutable Array
So I'd pick Array for performance.
And of course, the Mutable Array one is also easy to reason about since I have a more imperative language background. I'm not sure why I would pick this one if I'm coding in Haskell, though, since it's both slower than the others and non-idiomatic.
Code copied here for reference:
List:
primesTo m = 2 : eratos [3,5..m] where eratos (p : xs) | p*p>m = p : xs | True = p : eratos (xs `minus` [p*p, p*p+2*p..]) minus a@(x:xs) b@(y:ys) = case compare x y of LT -> x : minus xs b EQ -> minus xs ys GT -> minus a ys minus a b = a
Immutable Array:
import Data.Array.Unboxed primesToA m = sieve 3 (array (3,m) [(i,odd i) | i<-[3..m]] :: UArray Int Bool) where sieve p a | p*p > m = 2 : [i | (i,True) <- assocs a] | a!p = sieve (p+2) $ a//[(i,False) | i <- [p*p, p*p+2*p..m]] | otherwise = sieve (p+2) a
Mutable Array:
import Control.Monad (forM_, when) import Control.Monad.ST import Data.Array.ST import Data.Array.Unboxed primeSieve :: Integer -> UArray Integer Bool primeSieve top = runSTUArray $ do a <- newArray (2,top) True -- :: ST s (STUArray s Integer Bool) let r = ceiling . sqrt $ fromInteger top forM_ [2..r] $ \i -> do ai <- readArray a i when ai $ do forM_ [i*i,i*i+i..top] $ \j -> do writeArray a j False return a -- Return primes from sieve as list: primesTo :: Integer -> [Integer] primesTo top = [p | (p,True) <- assocs $ primeSieve top]
EDIT
- I showed Turner's Sieve at the top but that's not the list-based example I'm comparing with the other two. I just wanted to know if the list-based example suffers from the same "not the correct Sieve of Eratosthenes" problems as Turner's.
- It appears the performance comparison is unfair because the Mutable Array example goes through evens as well and uses
Integer
rather thanInt
...
解决方案This
primes = sieve [2..] where sieve (p:x) = p : sieve [ n | n <- x, n `mod` p > 0 ]
is not a sieve. It's very inefficient trial division. Don't use that!
I'm curious about how you got your times, there is no way that the Turner "sieve" could produce the primes not exceeding 2,000,000 in mere seconds. Letting it find the primes to 200,000 took
MUT time 6.38s ( 6.39s elapsed) GC time 9.19s ( 9.20s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 15.57s ( 15.59s elapsed)
on my box (64-bit Linux, ghc-7.6.1, compiled with -O2). The complexity of that algorithm is
O(N² / log² N)
, almost quadratic. Letting it proceed to 2,000,000 would take about twenty minutes.Your times for the array versions are suspicious too, though in the other direction. Did you measure interpreted code?
Sieving to 2,000,000, compiled with optimisations, the mutable array code took 0.35 seconds to run, and the immutable array code 0.12 seconds.
Now, that still has the mutable array about three times slower than the immutable array.
But, it's an unfair comparison. For the immutable array, you used
Int
s, and for the mutable arrayInteger
s. Changing the mutable array code to useInt
s - as it should, since under the hood, arrays areInt
-indexed, so usingInteger
is an unnecessary performance sacrifice that buys nothing - made the mutable array code run in 0.15 seconds. Close to the mutable array code, but not quite there. However, you let the mutable array do more work, since in the immutable array code you only eliminate odd multiples of the odd primes, but in the mutable array code, you mark all multiples of all primes. Changing the mutable array code to treat 2 specially, and only eliminate odd multiples of odd primes brings that down to 0.12 seconds too.But, you're using range-checked array indexing, which is slow, and, since the validity of the indices is checked in the code itself, unnecessary. Changing that to using
unsafeRead
andunsafeWrite
brings down the time for the immutable array to 0.09 seconds.Then you have the problem that using
forM_ [x, y .. z]
uses boxed
Int
s (fortunately, GHC eliminates the list). Writing a loop yourself, so that only unboxedInt#
s are used, the time goes down to 0.02 seconds.{-# LANGUAGE MonoLocalBinds #-} import Control.Monad (forM_, when) import Control.Monad.ST import Data.Array.ST import Data.Array.Unboxed import Data.Array.Base primeSieve :: Int -> UArray Int Bool primeSieve top = runSTUArray $ do a <- newArray (0,top) True unsafeWrite a 0 False unsafeWrite a 1 False let r = ceiling . sqrt $ fromIntegral top mark step idx | top < idx = return () | otherwise = do unsafeWrite a idx False mark step (idx+step) sift p | r < p = return a | otherwise = do prim <- unsafeRead a p when prim $ mark (2*p) (p*p) sift (p+2) mark 2 4 sift 3 -- Return primes from sieve as list: primesTo :: Int -> [Int] primesTo top = [p | (p,True) <- assocs $ primeSieve top] main :: IO () main = print .last $ primesTo 2000000
So, wrapping up, for a Sieve of Eratosthenes, you should use an array - not surprising, its efficiency depends on being able to step from one multiple to the next in short constant time.
You get very simple and straightforward code with immutable arrays, and that code performs decently for not too high limits (it gets relatively worse for higher limits, since the arrays are still copied and garbage-collected, but that's not too bad).
When you need better performance, you need mutable arrays. Writing efficient mutable array code is not entirely trivial, one has to know how the compiler translates the various constructs to choose the right one, and some would consider such code unidiomatic. But you can also use a library (disclaimer: I wrote it) that provides a fairly efficient implementation rather than writing it yourself.
这篇关于列表,阵列和可变阵列之间的Eratosthenes筛选的理想实现是什么?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!