埃拉托色尼分段筛分法? [英] Segmented Sieve of Eratosthenes?

查看:27
本文介绍了埃拉托色尼分段筛分法?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

制作一个简单的筛子很容易:

It's easy enough to make a simple sieve:

for (int i=2; i<=N; i++){
    if (sieve[i]==0){
        cout << i << " is prime" << endl;
        for (int j = i; j<=N; j+=i){
            sieve[j]=1;
        }
    }
    cout << i << " has " << sieve[i] << " distinct prime factors
";
}

但是当 N 非常大而我无法在内存中保存那种数组时怎​​么办?我查过分段筛法,它们似乎涉及到找到素数直到 sqrt(N) 但我不明白它是如何工作的.如果 N 非常大(比如 10^18)怎么办?

But what about when N is very large and I can't hold that kind of array in memory? I've looked up segmented sieve approaches and they seem to involve finding primes up until sqrt(N) but I don't understand how it works. What if N is very large (say 10^18)?

推荐答案

分段筛的基本思想是选择小于n平方根的筛选素数,选择一个合理大的分段大小仍然适合内存,然后依次筛选每个段,从最小的开始.在第一段,计算段内每个筛分质数的最小倍数,然后按正常方式将筛分质数的倍数标记为合数;当所有的筛选素数都用完后,该段中剩余的未标记数字是素数.然后,对于下一个片段,对于每个筛选素数,您已经知道当前片段中的第一个倍数(它是结束前一个片段中该素数筛选的倍数),因此您筛选每个筛选素数,依此类推直到你完成.

The basic idea of a segmented sieve is to choose the sieving primes less than the square root of n, choose a reasonably large segment size that nevertheless fits in memory, and then sieve each of the segments in turn, starting with the smallest. At the first segment, the smallest multiple of each sieving prime that is within the segment is calculated, then multiples of the sieving prime are marked as composite in the normal way; when all the sieving primes have been used, the remaining unmarked numbers in the segment are prime. Then, for the next segment, for each sieving prime you already know the first multiple in the current segment (it was the multiple that ended the sieving for that prime in the prior segment), so you sieve on each sieving prime, and so on until you are finished.

n 的大小无关紧要,除了较大的 n 比较小的 n 需要更长的筛选时间;重要的大小是段的大小,应该尽可能大(例如,机器上主内存缓存的大小).

The size of n doesn't matter, except that a larger n will take longer to sieve than a smaller n; the size that matters is the size of the segment, which should be as large as convenient (say, the size of the primary memory cache on the machine).

您可以在此处查看分段筛的简单实现.一>.请注意,分段筛选将比另一个答案中提到的 O'Neill 的优先队列筛选快得多;如果您有兴趣,这里有一个实现.

我写这个是为了不同的目的,但我会在这里展示它,因为它可能有用:

I wrote this for a different purpose, but I'll show it here because it might be useful:

虽然 Eratosthenes 的 Sieve 非常快,但它需要 O(n) 空间.通过在连续段中执行筛选,可以将筛选素数减少为 O(sqrt(n)) 加上位阵列的 O(1).在第一段,计算段内每个筛分质数的最小倍数,然后按正常方式将筛分质数的倍数标记为合数;当所有的筛选素数都用完后,该段中剩余的未标记数字是素数.然后,对于下一段,每个筛选素数的最小倍数就是前一段中结束筛选的倍数,这样筛选一直持续到完成.

Though the Sieve of Eratosthenes is very fast, it requires O(n) space. That can be reduced to O(sqrt(n)) for the sieving primes plus O(1) for the bitarray by performing the sieving in successive segments. At the first segment, the smallest multiple of each sieving prime that is within the segment is calculated, then multiples of the sieving prime are marked composite in the normal way; when all the sieving primes have been used, the remaining unmarked numbers in the segment are prime. Then, for the next segment, the smallest multiple of each sieving prime is the multiple that ended the sieving in the prior segment, and so the sieving continues until finished.

考虑以20段为100到200进行筛分的例子.5个筛分素数是3、5、7、11和13.在100到120的第一段中,bitarray有10个槽,槽0对应101,槽k对应100+2k+1,槽9对应119.段中3的最小倍数为105,对应槽2;槽2+3=5和5+3=8也是3的倍数.槽2处5的最小倍数是105,槽2+5=7也是5的倍数.7的最小倍数是105在槽2,槽2+7=9也是7的倍数.依此类推.

Consider the example of sieve from 100 to 200 in segments of 20. The five sieving primes are 3, 5, 7, 11 and 13. In the first segment from 100 to 120, the bitarray has ten slots, with slot 0 corresponding to 101, slot k corresponding to 100+2k+1, and slot 9 corresponding to 119. The smallest multiple of 3 in the segment is 105, corresponding to slot 2; slots 2+3=5 and 5+3=8 are also multiples of 3. The smallest multiple of 5 is 105 at slot 2, and slot 2+5=7 is also a multiple of 5. The smallest multiple of 7 is 105 at slot 2, and slot 2+7=9 is also a multiple of 7. And so on.

函数 primesRange 接受参数 lo、hi 和 delta;lo 和 hi 必须是偶数,且 lo

Function primesRange takes arguments lo, hi and delta; lo and hi must be even, with lo < and lo must be greater than sqrt(hi). The segment size is twice delta. Ps is a linked list containing the sieving primes less than sqrt(hi), with 2 removed since even numbers are ignored. Qs is a linked list containing the offest into the sieve bitarray of the smallest multiple in the current segment of the corresponding sieving prime. After each segment, lo advances by twice delta, so the number corresponding to an index i of the sieve bitarray is lo + 2i + 1.

function primesRange(lo, hi, delta)
    function qInit(p)
        return (-1/2 * (lo + p + 1)) % p
    function qReset(p, q)
        return (q - delta) % p
    sieve := makeArray(0..delta-1)
    ps := tail(primes(sqrt(hi)))
    qs := map(qInit, ps)
    while lo < hi
        for i from 0 to delta-1
            sieve[i] := True
        for p,q in ps,qs
            for i from q to delta step p
                sieve[i] := False
        qs := map(qReset, ps, qs)
        for i,t from 0,lo+1 to delta-1,hi step 1,2
            if sieve[i]
                output t
        lo := lo + 2 * delta

当被调用为 primesRange(100, 200, 10) 时,筛选素数 ps 为 [3, 5, 7, 11, 13];qs 初始为[2, 2, 2, 10, 8],对应最小的倍数105, 105, 105, 121, 117,第二段重置为[1, 2, 6, 0, 11]对应最小的123、125、133、121和143的倍数.

When called as primesRange(100, 200, 10), the sieving primes ps are [3, 5, 7, 11, 13]; qs is initially [2, 2, 2, 10, 8] corresponding to smallest multiples 105, 105, 105, 121 and 117, and is reset for the second segment to [1, 2, 6, 0, 11] corresponding to smallest multiples 123, 125, 133, 121 and 143.

您可以在 http://ideone.com/iHYr1f.除了上面显示的链接之外,如果您对使用质数编程感兴趣,我在我的博客上谦虚地推荐这篇文章.

这篇关于埃拉托色尼分段筛分法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆