Benchmarking math.h square root and Quake square root


Question

Okay, so I was bored and wondered how fast the math.h square root is compared to the one with the magic number in it (made famous by Quake but made by SGI).

But this has ended up in a world of hurt for me.

I first tried this on the Mac, where math.h won hands down every time, and then on Windows, where the magic number always won. I think this is all down to my own noobness:


  1. Compiled on the Mac with "g++ -o sq_root sq_root_test.cpp", the program takes about 15 seconds to complete. Compiled in VS2005 in release mode, it finishes in a split second. (In fact, I had to compile in debug mode just to get it to show some numbers.)

  2. Is my poor man's benchmarking really stupid? I get 0.01 for math.h and 0 for the magic number. (It can't be that fast, can it?)

  3. I don't know if this matters, but the Mac is Intel and the PC is AMD. Is the Mac using hardware for the math.h square root?

I took the magic-number function from http://en.wikipedia.org/wiki/Fast_inverse_square_root

//sq_root_test.cpp

#include <iostream>
#include <math.h>
#include <ctime>


float invSqrt(float x)
{
    union {
        float f;
        int i;
    } tmp;
    tmp.f = x;
    tmp.i = 0x5f3759df - (tmp.i >> 1);
    float y = tmp.f;
    return y * (1.5f - 0.5f * x * y * y);
}

int main() {
    std::clock_t start;// = std::clock();
    std::clock_t end;
    float rootMe;

    int iterations = 999999999;

    // ---

    rootMe = 2.0f;
    start = std::clock();

    std::cout << "Math.h SqRoot: ";

    for (int m = 0; m < iterations; m++) {
        (float)(1.0/sqrt(rootMe));
        rootMe++;
    }

    end = std::clock();

    std::cout << (difftime(end, start)) << std::endl;

    // ---

    std::cout << "Quake SqRoot: ";

    rootMe = 2.0f;
    start = std::clock();

    for (int q = 0; q < iterations; q++) {
        invSqrt(rootMe);
        rootMe++;
    }

    end = std::clock();

    std::cout << (difftime(end, start)) << std::endl;   
}


Recommended answer

There are several problems with your benchmark. First, it includes potentially expensive datatype conversions: 1.0/sqrt(rootMe) is evaluated in double precision and the result is then cast back to float. If you want to know what a square root costs, you should benchmark square roots, not datatype conversions.
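One way to keep conversions out of the measured expression is to stay in float throughout. The sketch below assumes the float overload of std::sqrt from <cmath>; the helper name inv_sqrt_std is made up for illustration and does not appear in the question.

```cpp
#include <cmath>
#include <type_traits>

// Hypothetical helper: reciprocal square root computed entirely in float.
// std::sqrt(float) returns float, and the literal 1.0f keeps the division
// in single precision as well.
inline float inv_sqrt_std(float x) {
    return 1.0f / std::sqrt(x);
}

// The float overload really does return float...
static_assert(std::is_same<decltype(std::sqrt(2.0f)), float>::value,
              "std::sqrt(float) should return float");
// ...whereas dividing the double literal 1.0 by it promotes to double,
// which is the extra conversion the original loop pays for.
static_assert(std::is_same<decltype(1.0 / std::sqrt(2.0f)), double>::value,
              "1.0 / float should be double");
```

With this helper, inv_sqrt_std(4.0f) yields 0.5f without any intermediate double.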

Second, your entire benchmark can be (and is) optimized out by the compiler because it has no observable side effects. You don't use the returned value (or store it in a volatile memory location), so the compiler sees that it can skip the whole thing.
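A minimal sketch of a timing loop whose results stay observable, assuming a volatile sink is acceptable for this purpose. The names Timing, time_loop and time_std_inv_sqrt are hypothetical, and std::chrono::steady_clock is swapped in for the std::clock calls in the question.

```cpp
#include <chrono>
#include <cmath>

// Hypothetical result type for this sketch (not from the original post).
struct Timing {
    double seconds;  // wall-clock time spent in the loop
    float last;      // last value written to the sink
};

// Times `iterations` calls of f. Every result is stored to a volatile
// variable, so the compiler cannot discard the computed values.
template <typename F>
Timing time_loop(F f, int iterations) {
    volatile float sink = 0.0f;
    float x = 2.0f;
    const auto start = std::chrono::steady_clock::now();
    for (int n = 0; n < iterations; ++n) {
        sink = f(x);
        x += 1.0f;
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return Timing{elapsed.count(), sink};
}

// Example use: time the library reciprocal square root, all in float.
Timing time_std_inv_sqrt(int iterations) {
    return time_loop([](float v) { return 1.0f / std::sqrt(v); }, iterations);
}
```

Any other candidate can be passed to time_loop the same way, so both versions run through identical loop overhead. Note that a float counter stops increasing once it reaches 16777216, so very long runs end up feeding the same input repeatedly.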

A clue here is that you had to disable optimizations. That means your benchmarking code is broken. Never disable optimizations when benchmarking. You want to know which version runs fastest, so you should test it under the conditions it would actually be used under. If you were to use square roots in performance-sensitive code, you would enable optimizations, so how the code behaves without them is completely irrelevant.

Also, you're not benchmarking the cost of computing a square root, but of the inverse square root. If you want to know which way of computing the square root is fastest, you have to move the 1.0/... division down to the Quake version. (And since division is a pretty expensive operation, this might make a big difference in your results.)
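To compare like with like, both candidates should return the same quantity. The sketch below assumes the magic-number routine from the question, rewritten with std::memcpy instead of a union for the bit reinterpretation. The function names are made up for illustration, and the multiply variant is an alternative that is not part of the original suggestion.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Magic-number approximation of 1/sqrt(x) with one Newton-Raphson step,
// as in the question, using memcpy to reinterpret the float's bits.
float inv_sqrt_magic(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);
}

// Square root via the approximation, with the division moved here.
float sqrt_via_magic_div(float x) {
    return 1.0f / inv_sqrt_magic(x);
}

// Alternative (an assumption of this sketch): sqrt(x) = x * (1/sqrt(x)),
// which avoids the division altogether.
float sqrt_via_magic_mul(float x) {
    return x * inv_sqrt_magic(x);
}

// Library square root in float, for comparison.
float sqrt_std(float x) {
    return std::sqrt(x);
}
```

Timing sqrt_std against either magic variant then measures a square root on both sides. The approximation is only accurate to a fraction of a percent, so the results will differ slightly from the library value.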

Finally, it might be worth pointing out that Carmack's little trick was designed to be fast on 12-year-old computers. Once you fix your benchmark, you'll probably find that it's no longer an optimization, because today's CPUs are much faster at computing "real" square roots.
