What is microbenchmarking?


Question

I've heard this term used, but I'm not entirely sure what it means, so:

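  • What does it mean, and what does it NOT mean?
  • What are some examples of what IS and ISN'T microbenchmarking?
  • What are the dangers of microbenchmarking, and how do you avoid them?
    • (Or is it a good thing?)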

    Recommended answer

    It means exactly what it says on the tin: it's measuring the performance of something "small", like a system call to the kernel of an operating system.

    The danger is that people may use whatever results they obtain from microbenchmarking to dictate optimizations. And as we all know:

    "We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil" -- Donald Knuth

    Many factors can skew the results of microbenchmarks. Compiler optimizations are one of them. Measurement overhead is another: if the operation being measured takes so little time that whatever you use to measure it takes longer than the operation itself, your results will be skewed as well.
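The measurement-overhead point can be made concrete with a rough sketch. It assumes a POSIX system where `clock_gettime` with `CLOCK_MONOTONIC` is available. The helper names `now_ns` and `timer_overhead_ns` are made up for this illustration and are not part of the original answer.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Estimates the average cost of one now_ns() call by calling it back to back.
   The result is only a rough figure and varies from run to run. */
double timer_overhead_ns(int calls)
{
    assert(calls > 0);

    long long start = now_ns();
    for (int i = 0; i < calls; ++i)
    {
        (void)now_ns();  /* the timer call itself is the thing being timed */
    }
    long long end = now_ns();

    return (double)(end - start) / calls;
}
```

If the figure returned by something like `timer_overhead_ns(1000000)` is in the same range as the per-operation time you are trying to measure, the timer is probably dominating the result, and timing a large batch of operations between two timer reads is likely the safer approach.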

    For example, someone might microbenchmark the overhead of for loops:

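    #include <stdio.h>
    #include <time.h>

    void TestForLoop(void)
    {
        clock_t start = clock();  // CPU time used by the process so far

        for (int i = 0; i < 1000000000; ++i)
        {
        }

        // Convert clock ticks to seconds before dividing by the iteration count
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        double elapsedPerIteration = elapsed / 1000000000;
        printf("Time elapsed for each iteration: %g s\n", elapsedPerIteration);
    }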
    

    Obviously, a compiler can see that the loop does absolutely nothing and may not generate any code for it at all. So the values of elapsed and elapsedPerIteration are pretty much useless.

    Even if the loop does something:

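    #include <stdio.h>
    #include <time.h>

    void TestForLoop(void)
    {
        int sum = 0;
        clock_t start = clock();  // CPU time used by the process so far

        for (int i = 0; i < 1000000000; ++i)
        {
            ++sum;
        }

        // Convert clock ticks to seconds before dividing by the iteration count
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        double elapsedPerIteration = elapsed / 1000000000;
        printf("Time elapsed for each iteration: %g s\n", elapsedPerIteration);
    }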
    

    The compiler may see that the variable sum is never used for anything, optimize it away, and then optimize away the for loop as well. But wait! What if we do this:

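    #include <stdio.h>
    #include <time.h>

    void TestForLoop(void)
    {
        int sum = 0;
        clock_t start = clock();  // CPU time used by the process so far

        for (int i = 0; i < 1000000000; ++i)
        {
            ++sum;
        }

        // Convert clock ticks to seconds before dividing by the iteration count
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        double elapsedPerIteration = elapsed / 1000000000;
        printf("Time elapsed for each iteration: %g s\n", elapsedPerIteration);
        printf("Sum: %d\n", sum); // Added
    }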
    

    The compiler might be smart enough to realize that sum will always end up as the same constant value, and optimize the whole loop away as well. Many people would be surprised at the optimizing capabilities of compilers these days.
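One common way to stop a compiler from removing such a loop, which the original answer does not cover, is to declare the counter `volatile`. The sketch below illustrates that idea. The function name `CountWithVolatile` and its parameters are made up for this example. Note that `volatile` also changes what is being measured, because every increment now has to go through memory, so the timing reflects that variant rather than the original loop.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Illustrative helper (not from the original answer): increments a volatile
   counter `iterations` times and returns the final count. If seconds_out is
   not NULL, the CPU time spent in the loop is stored there. */
int CountWithVolatile(int iterations, double *seconds_out)
{
    assert(iterations >= 0);

    volatile int sum = 0;  /* volatile: each read and write must actually happen */
    clock_t start = clock();

    for (int i = 0; i < iterations; ++i)
    {
        ++sum;
    }

    clock_t end = clock();
    if (seconds_out != NULL)
    {
        *seconds_out = (double)(end - start) / CLOCKS_PER_SEC;
    }
    return sum;
}
```

Returning the count to the caller keeps the result observable, and the `volatile` qualifier keeps the compiler from folding the loop into a constant.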

    But what about things that compilers can't optimize away?

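    #include <stdio.h>
    #include <time.h>

    void TestFileOpenPerformance(void)
    {
        FILE* file = NULL;
        clock_t start = clock();  // CPU time used by the process so far

        for (int i = 0; i < 1000000000; ++i)
        {
            file = fopen("testfile.dat", "r");  // fopen requires a mode argument
            if (file != NULL)                   // fclose(NULL) is undefined behavior
            {
                fclose(file);
            }
        }

        // Convert clock ticks to seconds before dividing by the iteration count
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        double elapsedPerIteration = elapsed / 1000000000;
        printf("Time elapsed for each file open: %g s\n", elapsedPerIteration);
    }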
    

    Even this is not a useful test! The operating system may see that the file is being opened very frequently, so it may preload it into memory to improve performance. Pretty much all operating systems do this. The same thing happens when you open applications: an operating system may figure out the top ~5 applications you open the most and preload their code into memory when you boot up the computer!
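A rough way to look for this effect is to time a single open separately from a long run of repeated opens. The sketch below assumes a POSIX system with `clock_gettime`. The names `now_ns` and `AverageOpenCloseNs` are invented for this illustration. Whether any difference actually shows up depends on the operating system and on what it already has cached.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in nanoseconds. */
static long long now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Average wall-clock nanoseconds per fopen/fclose pair over `repeats` rounds.
   Returns -1.0 if the file cannot be opened. */
double AverageOpenCloseNs(const char *path, int repeats)
{
    assert(repeats > 0);

    long long start = now_ns();
    for (int i = 0; i < repeats; ++i)
    {
        FILE *file = fopen(path, "rb");
        if (file == NULL)
        {
            return -1.0;  /* nothing meaningful to report */
        }
        fclose(file);
    }
    return (double)(now_ns() - start) / repeats;
}
```

Comparing `AverageOpenCloseNs("testfile.dat", 1)` on a file that has not been touched recently with `AverageOpenCloseNs("testfile.dat", 100000)` may show the first open costing noticeably more than the average of the later ones. That gap is the kind of caching effect that makes a tight open/close loop unrepresentative.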

    In fact, there are countless variables that come into play: locality of reference (e.g. arrays vs. linked lists), effects of caches and memory bandwidth, compiler inlining, compiler implementation, compiler switches, number of processor cores, optimizations at the processor level, operating system schedulers, operating system background processes, etc.
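To make the locality-of-reference item concrete, here is a small sketch of the same work done over two memory layouts: a contiguous array and a linked list with individually allocated nodes. All names (`Node`, `SumArray`, `SumList`, `BuildList`, `FreeList`) are made up for this illustration. The sketch only sets up the two traversals. Timing them, and how large any gap turns out to be, depends on the machine, the allocator, and the data size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct Node
{
    int value;
    struct Node *next;
};

/* Sums a contiguous array: consecutive elements share cache lines. */
long long SumArray(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    long long sum = 0;
    for (size_t i = 0; i < count; ++i)
    {
        sum += values[i];
    }
    return sum;
}

/* Sums a linked list: each step follows a pointer to wherever the node lives. */
long long SumList(const struct Node *head)
{
    long long sum = 0;
    for (const struct Node *node = head; node != NULL; node = node->next)
    {
        sum += node->value;
    }
    return sum;
}

/* Frees every node of the list. */
void FreeList(struct Node *head)
{
    while (head != NULL)
    {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list holding values[0..count-1] in order.
   Returns NULL if count is 0 or an allocation fails. */
struct Node *BuildList(const int *values, size_t count)
{
    struct Node *head = NULL;
    for (size_t i = count; i > 0; --i)
    {
        struct Node *node = malloc(sizeof *node);
        if (node == NULL)
        {
            FreeList(head);
            return NULL;
        }
        node->value = values[i - 1];
        node->next = head;
        head = node;
    }
    return head;
}
```

Both functions compute the same sum, so any timing difference between them comes from how the data is laid out and accessed rather than from the arithmetic. That is exactly the kind of variable a microbenchmark can silently hide or exaggerate.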

    So in a lot of cases, microbenchmarking isn't a particularly useful metric. It definitely does not replace whole-program benchmarks with well-defined test cases (profiling). Write readable code first, then profile to see what needs to be done, if anything.

    I would like to emphasize that microbenchmarks are not evil per se, but one has to use them carefully (the same is true of lots of other things related to computers).
