优化 R 中的质数查找器和 CPU 速度检查器 [英] Optimize prime number finder and cpu speed checker in R
问题描述
我有以下代码可以在 10 秒内找到质数:
I have following code to find prime numbers in 10 seconds:
prime_nums = function (){
ptm <- proc.time()
p_nums = c(2)
counter = 2
while (TRUE){
isPRIME = FALSE
counter = counter +1
for(n in p_nums) {
if(n > sqrt(counter)){ isPRIME=TRUE; break; }
if(counter %% n == 0){ isPRIME = FALSE; break;}
}
if(isPRIME) { p_nums[length(p_nums)+1]=counter ; cat("",counter,";")}
if((proc.time()[3]-ptm[3]) > 10) break;
}
}
然而,这是用许多循环编写的,这些循环在 R 中通常不是首选.如何优化此代码以使其尽可能快?
However, this is written with many loops which are generally not preferred in R. How can I optimize this code to become as fast as possible?
我发现以下代码最快:
prime_nums_fastest = function (){
ptm <- proc.time()
p_nums = c(2L,3L,5L,7L)
counter = 7L
while (TRUE){
isPRIME = FALSE
counter = counter +2L
loc = 4*sqrt(counter)/log(counter,2)
isPRIME = !any(0 == (counter %% p_nums[1:loc]))
if(isPRIME) { p_nums[length(p_nums)+1]=counter }
if((proc.time()[3]-ptm[3]) > 10) break;
}
print(p_nums)
}
保留初始小素数以简化.使用 2*sqrt.. 甚至 3*sqrt... 作为 loc 参数会导致包含非素数.与使用 1:sqrt(counter) 相比,需要检查的素数要少得多.
Initial small primes are kept to simplify. Using 2*sqrt.. or even 3*sqrt... for loc parameter leads to inclusion of non-primes. Significantly less primes need to be checked than using 1:sqrt(counter).
推荐答案
摆脱 cat
命令.这太贵了.有了它,我得到了 384239.返回质数向量反而让我得到了 471617,这是一个显着的改进.
Get rid of the cat
command. That's expensive. With it in place, I get to 384239. Returning the vector of primes instead gets me to 471617, a significant improvement.
改变 n >sqrt(counter)
到 n*n >counter
让我得到 477163,一个小小的改进.
Changing n > sqrt(counter)
to n*n > counter
gets me to 477163, a small improvement.
将 p_nums
和 counter
更改为 integer
类型使我得到 514859,这是另一个小改进.这是通过修改定义和调整这些行的行来实现的:
Changing p_nums
and counter
to be of type integer
gets me to 514859, another small improvement. This is achieved by modifying the lines where these are defined and adjusted:
p_nums = c(2L)
counter = 2L
# ... and inside the loop:
counter = counter +1L
请注意,您可以使用如下代码对确定某个值为素数的循环进行矢量化:
Note that you can vectorize the loop which determines that a value is prime, with code such as this:
isPRIME = !any(0 == (counter %% p_nums[1:sqrt(counter)]))
使用它代替 for
让我得到 451249,这是一个显着的回归(不使用 cat
并使用整数算法).这是因为 R 没有惰性列表评估,所以对每个值都取模数,然后对 0 进行测试.在这种情况下,这是 for
的一个优点.
Using that instead of for
gets me to 451249, a significant regression (using no cat
and using integer arithmetic). This is because R does not have lazy list evaluation, so the modulus is taken for every value, then they are tested against 0. This is an advantage of for
in this case.
这篇关于优化 R 中的质数查找器和 CPU 速度检查器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!