How to do good benchmarking of complex functions?


Question

I am about to embark on very detailed benchmarking of a set of complex functions in C. This is "science level" detail. I'm wondering, what would be the best way to do serious benchmarking? I was thinking about running them, say, 10 times each, averaging the timing results and giving the standard deviation, for instance, just using <time.h>. What would you guys do to obtain good benchmarks?

Recommended answer

Reporting an average and standard deviation gives a good description of a distribution when the distribution in question is approximately normal. However, this is rarely true of computational performance measurements. Instead, performance measurements tend to be right-skewed and more closely resemble a Poisson distribution. This makes sense, because not many random events on a computer will cause a program to go faster; essentially all of the measurement noise comes from how many random events occur that cause it to slow down. (A normal distribution, by contrast, makes no intuitive sense at all; it would require the belief that a program has a non-zero probability of finishing in negative time.)

In light of this, I find it most useful to report the minimum time over many runs of a program, rather than the average; the noise in the distribution is typically noise of the measuring system, rather than meaningful information about the algorithm. For complex algorithms that have early-out conditions and other shortcuts, you need to be a little more careful, but the minimum of many runs where each run handles a representative balance of inputs usually works well.
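To make the difference between the minimum and the average concrete, here is a minimal sketch that summarizes a set of timing samples. The names `timing_summary` and `summarize` are hypothetical and not from the answer; the samples can be in any consistent unit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary type: min, mean, and max of a set of timings. */
typedef struct {
    double min;
    double mean;
    double max;
} timing_summary;

/* Summarize n > 0 timing samples (all in the same unit). */
timing_summary summarize(const double *samples, size_t n)
{
    assert(samples != NULL && n > 0);

    timing_summary s = { samples[0], 0.0, samples[0] };
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (samples[i] < s.min) s.min = samples[i];
        if (samples[i] > s.max) s.max = samples[i];
        sum += samples[i];
    }
    s.mean = sum / (double)n;
    return s;
}
```

With made-up samples of 10.0, 10.2, 10.1, 25.0 and 10.0 (one run disturbed by the system), the mean is pulled up to about 13.06, while the minimum stays at 10.0, which is closer to what the code itself costs.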

"10 times each" sounds like very few iterations to me. I generally do something on the order of thousands (or more, depending on the function/system) of runs unless that's completely infeasible. At a bare minimum, you need to make sure that you run the timing long enough to shake out any dependence on system state, some of which may change at fairly large time granularity.
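One way to combine "many runs" with "run long enough" is to keep timing until both a minimum run count and a minimum total duration have been reached, keeping the fastest run. The sketch below assumes a POSIX system with `clock_gettime`; `bench_min_ns`, `now_ns`, and the `sum_to_n` workload are hypothetical names used only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime under strict -std modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds (assumes POSIX clock_gettime). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical harness: call fn(arg) repeatedly until at least min_runs
 * calls have been made AND at least min_total_ns have elapsed overall.
 * Returns the fastest single call in nanoseconds; the number of calls
 * made is stored in *runs_out when runs_out is not NULL.
 */
uint64_t bench_min_ns(void (*fn)(void *), void *arg,
                      uint64_t min_runs, uint64_t min_total_ns,
                      uint64_t *runs_out)
{
    assert(fn != NULL);

    uint64_t best = UINT64_MAX;
    uint64_t runs = 0;
    const uint64_t start = now_ns();
    uint64_t t1;

    do {
        const uint64_t t0 = now_ns();
        fn(arg);
        t1 = now_ns();
        if (t1 - t0 < best) best = t1 - t0;
        runs++;
    } while (runs < min_runs || t1 - start < min_total_ns);

    if (runs_out != NULL) *runs_out = runs;
    return best;
}

/* Hypothetical workload, only here so the harness has something to time. */
void sum_to_n(void *arg)
{
    const uint64_t n = *(const uint64_t *)arg;
    volatile uint64_t acc = 0;   /* volatile keeps the loop from being optimized away */
    for (uint64_t i = 0; i < n; i++) acc += i;
}
```

A call such as `bench_min_ns(sum_to_n, &n, 1000, 1000000, &runs)` would time at least 1000 calls over at least one millisecond. Reasonable thresholds depend on the function and the system, so treat these numbers as placeholders.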

The other thing you should be aware of is that essentially every system has a platform-specific timer available that is much more accurate than what is available in <time.h>. Find out what it is on your target platform(s) and use it instead.
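As a rough illustration of what such a timer can look like, the sketch below wraps two commonly used ones: `QueryPerformanceCounter` on Windows and `clock_gettime(CLOCK_MONOTONIC)` on POSIX systems. The wrapper name `hires_now_ns` is hypothetical. Note that on POSIX systems `clock_gettime` happens to be declared in `<time.h>`, but it is a POSIX extension and not one of the ISO C facilities such as `clock()`. Other platforms may offer better options, so check the documentation for your target.

```c
#if defined(_WIN32)
#include <windows.h>
#else
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime under strict -std modes */
#endif
#include <time.h>
#endif
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: monotonic high-resolution time in nanoseconds. */
uint64_t hires_now_ns(void)
{
#if defined(_WIN32)
    LARGE_INTEGER freq, count;
    QueryPerformanceFrequency(&freq);   /* ticks per second */
    QueryPerformanceCounter(&count);    /* current tick count */

    const uint64_t f = (uint64_t)freq.QuadPart;
    const uint64_t c = (uint64_t)count.QuadPart;
    /* Split into whole seconds and remainder to avoid 64-bit overflow. */
    return (c / f) * 1000000000u + (c % f) * 1000000000u / f;
#else
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;                           /* silence unused warning under NDEBUG */
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}
```

Taking the difference of two `hires_now_ns()` readings around the code under test gives an elapsed time in nanoseconds. The actual resolution depends on the hardware and operating system.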
