What is the fastest way to get the value of π?


Question


I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically, I'm using ways that don't involve using #define constants like M_PI, or hard-coding the number in.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable. I've included it as a baseline to compare against the other versions. In my tests, with built-ins, the 4 * atan(1) version is fastest on GCC 4.2, because it auto-folds the atan(1) into a constant. With -fno-builtin specified, the atan2(0, -1) version is fastest.

Here's the main testing program (pitimes.c):

#include <math.h>
#include <stdio.h>
#include <time.h>

#define ITERS 10000000
#define TESTWITH(x) {                                                       \
    diff = 0.0;                                                             \
    time1 = clock();                                                        \
    for (i = 0; i < ITERS; ++i)                                             \
        diff += (x) - M_PI;                                                 \
    time2 = clock();                                                        \
    printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1));   \
}

static inline double
diffclock(clock_t time1, clock_t time0)
{
    return (double) (time1 - time0) / CLOCKS_PER_SEC;
}

int
main()
{
    int i;
    clock_t time1, time2;
    double diff;

    /* Warmup. The atan2 case catches GCC's atan folding (which would
     * optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
     * is not used. */
    TESTWITH(4 * atan(1))
    TESTWITH(4 * atan2(1, 1))

#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
    extern double fldpi();
    TESTWITH(fldpi())
#endif

    /* Actual tests start here. */
    TESTWITH(atan2(0, -1))
    TESTWITH(acos(-1))
    TESTWITH(2 * asin(1))
    TESTWITH(4 * atan2(1, 1))
    TESTWITH(4 * atan(1))

    return 0;
}

And the inline assembly stuff (fldpi.c) that will only work for x86 and x64 systems:

double
fldpi()
{
    double pi;
    asm("fldpi" : "=t" (pi));
    return pi;
}

And a build script that builds all the configurations I'm testing (build.sh):

#!/bin/sh
gcc -O3 -Wall -c           -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c           -m64 -o fldpi-64.o fldpi.c

gcc -O3 -Wall -ffast-math  -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall              -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math  -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall              -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm

Apart from testing with various compiler flags (I've compared 32-bit against 64-bit too, because the optimizations are different), I've also tried switching the order of the tests around. Even so, the atan2(0, -1) version comes out on top every time.

Solution

The Monte Carlo method, as mentioned, applies some great concepts, but it is clearly not the fastest by any reasonable measure. Also, it all depends on what kind of accuracy you are looking for. The fastest π I know of is the one with the digits hard-coded. Looking at Pi and Pi[PDF], there are a lot of formulae.
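To make the point about Monte Carlo concrete, here is a rough sketch of the usual quarter-circle estimator. It is an illustration only: the xorshift64 generator and the function names are choices made for this example so that it stays deterministic and self-contained. The statistical error shrinks roughly as 1/sqrt(N), so each extra digit costs about 100 times more samples.

```c
#include <stdint.h>

/* xorshift64 step (Marsaglia); used here only to keep the sketch
 * deterministic and free of library dependencies. */
static uint64_t
xorshift64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/* Uniform double in [0, 1) built from the top 53 bits. */
static double
next_unit(uint64_t *state)
{
    return (double) (xorshift64(state) >> 11) / 9007199254740992.0;
}

/* Estimate pi as 4 * (fraction of random points in the unit square
 * that land inside the quarter circle of radius 1). */
double
monte_carlo_pi(long samples, uint64_t seed)
{
    uint64_t state = seed ? seed : 1;   /* xorshift state must be non-zero */
    long inside = 0;
    long i;

    for (i = 0; i < samples; ++i) {
        double x = next_unit(&state);
        double y = next_unit(&state);
        if (x * x + y * y < 1.0)
            ++inside;
    }
    return 4.0 * (double) inside / (double) samples;
}
```

Even with a million samples, `monte_carlo_pi(1000000, 42)` should only be expected to agree with π to two or three digits.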

Here is a method that converges quickly, at about 14 digits per iteration. PiFast, the current fastest application, uses this formula with the FFT. I'll just write the formula, since the code is straightforward. Ramanujan almost found this formula, and Chudnovsky discovered it. It is how he calculated several billion digits of π, so it isn't a method to disregard. The terms overflow quickly and, since we are dividing factorials, it is advantageous to delay such calculations so that common terms cancel first.

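$$\frac{1}{\pi} = 12 \sum_{k=0}^{\infty} \frac{(-1)^k \, (6k)! \, (A + Bk)}{(3k)! \, (k!)^3 \, C^{3k + 3/2}}$$

where,

$$A = 13591409, \quad B = 545140134, \quad C = 640320$$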

Below is the Brent–Salamin algorithm. Wikipedia mentions that when a and b are "close enough", (a + b)² / 4t will be an approximation of π. I'm not sure what "close enough" means, but in my tests one iteration gave 2 digits, two gave 7, and three gave 15. Of course, this is with doubles, so the result may carry some representation error, and the true calculation could be more accurate.

let pi_2 iters =
    let rec loop_ a b t p i =
        if i = 0 then a,b,t,p
        else
            let a_n = (a +. b) /. 2.0 
            and b_n = sqrt (a*.b)
            and p_n = 2.0 *. p in
            let t_n = t -. (p *. (a -. a_n) *. (a -. a_n)) in
            loop_ a_n b_n t_n p_n (i - 1)
    in 
    let a,b,t,p = loop_ (1.0) (1.0 /. (sqrt 2.0)) (1.0/.4.0) (1.0) iters in
    (a +. b) *. (a +. b) /. (4.0 *. t)

Lastly, how about some pi golf (800 digits)? 160 characters!

int a=10000,b,c=2800,d,e,f[2801],g;main(){for(;b-c;)f[b++]=a/5;for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,f[b]=d%--g,d/=g--,--b;d*=b);}
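The golfed program is hard to follow as written, so here is an unrolled sketch of the same spigot loop. The variable roles are my reading of the one-liner, and the function name and the choice to write into a buffer instead of printing are changes made for this example. The array element f[2800] is zeroed explicitly, since the original relies on it being a zero-initialised global.

```c
#include <stdio.h>

/* Write 800 decimal digits of pi (starting "3141...") into out,
 * which must have room for 801 bytes including the terminator. */
void
pi_spigot_800(char *out)
{
    enum { BASE = 10000, N = 2800 };
    int f[N + 1];
    int carry = 0;      /* `e` in the golfed version */
    int b, c;

    for (b = 0; b < N; ++b)
        f[b] = BASE / 5;
    f[N] = 0;

    /* Each outer pass produces four digits and retires 14 array slots. */
    for (c = N; c > 0; c -= 14) {
        int d = 0;
        int g = 2 * c;

        for (b = c;;) {
            d += f[b] * BASE;
            f[b] = d % --g;     /* remainder stays in the array */
            d /= g--;           /* quotient is carried to the left */
            if (--b == 0)
                break;
            d *= b;
        }
        out += sprintf(out, "%.4d", carry + d / BASE);
        carry = d % BASE;
    }
}
```

A caller would declare `char digits[801];`, call `pi_spigot_800(digits);` and then print the string.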
