埃拉托色尼筛法 - 寻找素数 Python [英] Sieve of Eratosthenes - Finding Primes Python

查看:21
本文介绍了埃拉托色尼筛法 - 寻找素数 Python的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

澄清一下,这不是家庭作业问题:)

Just to clarify, this is not a homework problem :)

我想为我正在构建的数学应用程序找到质数 &遇到了Eratosthenes筛分方法.

I wanted to find primes for a math application I am building & came across Sieve of Eratosthenes approach.

我已经用 Python 编写了它的实现.但它非常慢.例如,如果我想找到所有小于 200 万的素数.需要> 20 分钟.(我在这一点上停止了它).我怎样才能加快速度?

I have written an implementation of it in Python. But it's terribly slow. For say, if I want to find all primes less than 2 million. It takes > 20 mins. (I stopped it at this point). How can I speed this up?

def primes_sieve(limit):
    limitn = limit+1
    primes = range(2, limitn)

    for i in primes:
        factors = range(i, limitn, i)
        for f in factors[1:]:
            if f in primes:
                primes.remove(f)
    return primes

print primes_sieve(2000)

更新:我最终对这段代码进行了分析 &发现从列表中删除元素花费了相当多的时间.考虑到它必须遍历整个列表(最坏情况)才能找到元素 &然后删除它,然后重新调整列表(也许还有一些副本?).无论如何,我扔掉了字典列表.我的新实现 -

UPDATE: I ended up doing profiling on this code & found that quite a lot of time was spent on removing an element from the list. Quite understandable considering it has to traverse the entire list (worst-case) to find the element & then remove it and then readjust the list (maybe some copy goes on?). Anyway, I chucked out list for dictionary. My new implementation -

def primes_sieve1(limit):
    limitn = limit+1
    primes = dict()
    for i in range(2, limitn): primes[i] = True

    for i in primes:
        factors = range(i,limitn, i)
        for f in factors[1:]:
            primes[f] = False
    return [i for i in primes if primes[i]==True]

print primes_sieve1(2000000)

推荐答案

你没有完全实现正确的算法:

You're not quite implementing the correct algorithm:

在您的第一个示例中,primes_sieve 不维护要罢工/取消设置的素数标志列表(如在算法中),而是连续调整整数列表的大小,这非常昂贵:从列表中删除一个项目需要将所有后续项目向下移动一个.

In your first example, primes_sieve doesn't maintain a list of primality flags to strike/unset (as in the algorithm), but instead resizes a list of integers continuously, which is very expensive: removing an item from a list requires shifting all subsequent items down by one.

在第二个示例中,primes_sieve1 维护了一个包含素数标志的字典,这是朝着正确方向迈出的一步,但它以未定义的顺序遍历字典,并且多余地剔除因子的因子(而不是仅素数的因子,如在算法中).您可以通过对键进行排序并跳过非素数来解决此问题(这已经使其速度提高了一个数量级),但直接使用列表仍然效率更高.

In the second example, primes_sieve1 maintains a dictionary of primality flags, which is a step in the right direction, but it iterates over the dictionary in undefined order, and redundantly strikes out factors of factors (instead of only factors of primes, as in the algorithm). You could fix this by sorting the keys, and skipping non-primes (which already makes it an order of magnitude faster), but it's still much more efficient to just use a list directly.

正确的算法(使用列表而不是字典)类似于:

The correct algorithm (with a list instead of a dictionary) looks something like:

def primes_sieve2(limit):
    a = [True] * limit                          # Initialize the primality list
    a[0] = a[1] = False

    for (i, isprime) in enumerate(a):
        if isprime:
            yield i
            for n in range(i*i, limit, i):     # Mark factors non-prime
                a[n] = False

(请注意,这还包括在素数平方 (i*i) 而不是其双精度数处开始非素数标记的算法优化.)

(Note that this also includes the algorithmic optimization of starting the non-prime marking at the prime's square (i*i) instead of its double.)

这篇关于埃拉托色尼筛法 - 寻找素数 Python的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆