perf stat gives a different number of instructions for every run

Problem description

I ran perf stat on the following empty program:

#include <stdio.h>
int main() {
}

After compiling it and running perf stat ./a.out, I got the following output (along with other data such as the number of cycles, task-clock, etc.):

418,869 instructions # 0.87 insns per cycle

The number of instructions changes on every perf run against the same ELF binary.

My actual need is to find the number of instructions executed in a particular function I wrote, so I planned to subtract the number above from the instruction count of the new program. (I could count the number of lines in the program.s file that gcc creates with the -S flag, but I'm confused after seeing how perf behaves.)

Why is the number of instructions not consistent, or to be exact, not identical from run to run?

Update: I followed the example given in the man page to use perf_event_open() in C.

Solution

To measure the number of instructions executed by your function, I suggest starting and stopping event counting with perf_event_open() at the function's entry and exit, rather than running your program twice, with and without the function.

As for the nondeterminism in the number of instructions executed by your empty program, you may be counting events in both user space and the kernel. I think the user-space count should stay the same between two runs, but on the kernel side many things happen behind the scenes to execute the program (loading the binary and handling page faults, for example), so I guess the nondeterminism comes from what happens in kernel code. To count only user-space instructions, you can use:

perf stat -e instructions:u ./a.out

Can you give more details about the differences?
