为什么我的筛子在寻找素数时表现不佳? [英] Why does my sieve not perform well for finding primes?

查看:88
本文介绍了为什么我的筛子在寻找素数时表现不佳?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我编写了两个主要查找器功能,而筛子的性能仅好约10%.对于简单版本,我正在使用两种优化方法.

I wrote two prime finder functions and the sieve only performs about 10% better. I'm using two optimizations for the simple version.

  • 不检查偶数
  • 仅检查平方根或 j * j <= i . (等效)

和筛版的一项优化

  • 仅检查平方根或 i * i <= n . (等效)

我可以在筛子上添加哪些优化?

What optimizations can I add to the sieve?

我的筛子非常慢.我现在还不想做按位的实现,我想了解一下这种实现是否带来任何好处.

My sieve is pretty slow. I don't want to do a bitwise implementation yet, I want to understand if this implementation offers any benefits.

或者如果我错过了实现重点.

Or if I missed an implementation point.

伪代码中的内部 for 循环在这里看起来很有趣/很奇怪

The inner for loop in the pseudocode here looks interesting / odd

https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

我不知道该怎么解释. ( update: OP似乎在注释中指出,这是从Wikipedia复制粘贴伪代码后格式不正确的问题,并且格式正确后现在很清楚)

I don't know how to interpret it. (update: the OP seems to indicate in the comments that it was an issue with incorrect formatting after copy-pasting the pseudocode from Wikipedia, and with the corrected formatting it is clear now)

这里是:

算法灵异筛网:

input: an integer n > 1.
output: all prime numbers from 2 through n.
let A be an array of Boolean values, indexed by integers 2 to n, initially all set to true.
for i = 2, 3, 4, ..., not exceeding √n do if A[i] is true for j = i2, i2+i, i2+2i, i2+3i, ..., not exceeding n do A[j] := false
return all i such that A[i] is true.

// prime-2
// 2 optimizations - odds and square root
function prime2(n){
  const primes = [2];
  not_prime: for(let i = 3; i < n; i += 2){
    for(let j = 2; j * j <= i; j++){
      if(i % j === 0){
        continue not_prime;
      }
    }
    primes.push(i);
  }
  return primes;
}

// prime-3
// sieve implementation
function prime3 (n) {
  const primes = [];
  const sieve = (new Array(n)).fill(true);
  for (let i = 2; i * i <= n; i += 1) {
    if (sieve[i]) {
      for (let j = i + i; j < n; j += i) {
        sieve[j] = false;
      }
    }
  }
  makePrimes(sieve, primes, n);
  return primes;
};
function makePrimes(sieve, primes, n){
  for (let i = 2; i < n; i++) {
    if(sieve[i]) {
      primes.push(i);
    }
  }
}

推荐答案

您所看到的是理论运行时复杂度差异的表达,即两种算法之间的真正算法差异.

What you see is an expression of the differences in theoretical run time complexities, i.e. the true algorithmic differences between the two algorithms.

最佳试验分割筛的复杂度为 O(n 1.5 /(log n) 2 ) (*)筛网的复杂性为 O(n log log n ).

Optimal trial division sieve's complexity is O(n1.5/(log n)2)(*) whereas the sieve of Eratosthenes' complexity is O(n log log n).

根据 Scott Sauyet 发布的经验运行时间数据stackoverflow.com/questions/60378812/why-does-my-sieve-not-perform-well-for-finding-primes#comment106879316_60378812>评论,

According to the empirical run time figures posted by Scott Sauyet in the comments,

   1e6      279ms      36ms
   1e7     6946ms     291ms
   -------------------------
   n^       1.40       0.91

经验增长顺序 大致〜n 1.4 和〜n 在测量范围内,很合适.

the empirical orders of growth are roughly ~n1.4 and ~n in the measured range, which is a good fit.

因此,您的 真正 筛子的性能很好. 审判分庭 筛子的效果与预期相同.如果我们增加问题的大小,那么代码的算法性质将永远胜过任何二次优化的存在或不存在.

So your genuine sieve does perform well. The trial division sieve performs as expected. The algorithmic nature of a code will always beat any presence or absence of any secondary optimizations, if we increase the problem size enough.

仅通过一个问题大小的点进行测量来比较性能,永远是不够的.因此,即使您看到的与较简单的产品"仅相差10%,但如果您以更大的尺寸进行测试,则差异也会更大.

And comparing performances by measuring them at just one problem-size point is never enough. So even if you see just 10% difference over the "simpler one", if you test at bigger sizes, the difference will be bigger.

如果您想获得一些代码方面的进一步改进的指针,请注意,对于初学者来说,您是从i+i而不是从i*i开始内部循环的.

If you want some pointers about what can be further improved in your code, do note that you start the inner loop from i+i instead of from i*i, for starters.

另一种常见的优化方法是特殊情况下的 2 ,从 3 开始,将候选值增加 2 ,并使用内部循环增量 2*i 而不是 i ,可实现2倍的即时加速.这是 车轮因式分解 优化的最简单形式,可以进一步应用,且收益递减尽管每增加一个素数.但是使用2-3-5-7是很常见的,如果有内存的话,应该再提高2倍的速度.

Another common optimization is to special-case 2, start from 3 and increment the candidates by 2 and use the inner loop increment of 2*i instead of just i, to achieve instant 2x speedup. This is the simplest form of wheel factorization optimization, which can be further applied, with diminishing returns though for each additional prime. But using 2-3-5-7 is common and should give about another 2x speedup, if memory serves.

最后但并非最不重要的是,做到

Last but not least, make it segmented.

(*) π(n) * π(√n) 来自素数,仅来自复合词.

这篇关于为什么我的筛子在寻找素数时表现不佳?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆