Using GPerf Tool: not working, issue with redirection?


Problem description

I'm trying to profile my software in order to optimize it.

I used gprof with the compilation flags -g -pg -O3, but the results are not precise enough.

Here is the output of my build:

$: make clean; make;

rm -f ./obj/*.o
rm -f ./bin/mdk-verifier
rm -f ./grammar/modal.output
rm -f ./grammar/modal.tab.h
rm -f ./grammar/*.cpp
rm -f ./lex.backup

bison -d -t -l -v -o ./grammar/modal.tab.c ./grammar/modal.y && mv ./grammar/modal.tab.c ./grammar/modal.tab.cpp
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./grammar/modal.tab.cpp -o ./obj/modal.tab.o
flex -l -b -o./grammar/lex.yy.cpp ./grammar/modal.lex   
g++ -O3 -g -pg -I./include -c ./grammar/lex.yy.cpp -o ./obj/lex.yy.o
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/Kripke.cc -o ./obj/Kripke.o   
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/Term.cc -o ./obj/Term.o 
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/BooleanConstant.cc -o ./obj/BooleanConstant.o 
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/Variable.cc -o ./obj/Variable.o 
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/PropositionalVariable.cc -o ./obj/PropositionalVariable.o 
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/Operation.cc -o ./obj/Operation.o 
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/BooleanOperation.cc -o ./obj/BooleanOperation.o 
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/ModalOperation.cc -o ./obj/ModalOperation.o   
g++ -O3 -g -pg -fPIC -std=c++11  -I./include -c ./src/Formula.cc -o ./obj/Formula.o 
g++ -O3 -g -pg -fPIC -std=c++11  -o ./obj/Main.o -c ./src/Main.cc 
g++ -O3 -g -pg -static -lprofiler -o ./bin/mdk-verifier ./obj/modal.tab.o ./obj/lex.yy.o ./obj/Kripke.o ./obj/Term.o ./obj/BooleanConstant.o ./obj/PropositionalVariable.o ./obj/Variable.o ./obj/Operation.o ./obj/BooleanOperation.o ./obj/ModalOperation.o ./obj/Formula.o ./obj/Main.o               

And here is how I call my program:

$: ./bin/mdk-verifier ./problem.txt < solution.txt 

After execution everything is fine and I get a gmon.out file. I then run the command gprof ./bin/mdk-verifier | more and get the following results:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 34.00      2.13     2.13       18   118.33   118.33  ModalOperation::checkBranch(Kripke&, unsigned int)
  ...
  ...
  5.91      4.98     0.37 54684911     0.00     0.00  BooleanOperation::checkBranch(Kripke&, unsigned int)
  4.63      5.27     0.29 54684911     0.00     0.00  PropositionalVariable::checkBranch(Kripke&, unsigned int)

Obviously, the call count for ModalOperation::checkBranch has overflowed: by printing a message every time I enter this function, I confirmed that it is called more than 18 times.

So I looked for another, more precise profiler and found GPerfTools by Google.

To use it, I installed the following packages on my Ubuntu system:

  • libgoogle-perftools-dev
  • google-perftools

Following the tutorial, I was asked to set the environment variable CPUPROFILE.

I did so, and this is what I get:

 $: env | grep "CPU"
 CPUPROFILE=./prof.out

I also added -lprofiler when linking my executable, so I thought everything was in place and that I could start analyzing the data in the file ./prof.out.

Unfortunately, this file never appears. Nothing is created, so I have nothing to analyze.

Does anyone have an idea why the ./prof.out file is not created and why the profiler is not gathering data?

Thanks in advance for your help!

Best regards,

Recommended answer

Your purpose is to save time in your software. There are multiple issues; first the negatives:

  • -O3: The compiler can optimize certain things. It cannot optimize the things that only you can optimize. What it can do is make them hard to find, by scrambling the code. The time to use -O3 is after you've found and fixed what you can.

gprof is venerable, but little more. It samples the program counter and counts function calls. Here is a list of problems with that. It does give you a call graph, but speedups can easily hide in that.

gperftools is better (REVISED in response to Aliaksei's comment) because it is a true stack sampler. Normally it is a "CPU profiler", and in that mode it is blind to any time spent blocked, such as in I/O or sleep. However, if you set the environment variable CPUPROFILE_REALTIME=1, you can make it sample on wall-clock time, so it will see I/O, sleeps, and other blocking system calls. It has numerous output options. It does not seem to make it easy to see a small random selection of the actual stack samples themselves, with line number information.

Now for the positives:

  • There is a method (not a product) that many people use: random pausing. The idea is to substitute quality for quantity, that is, to get stack samples at the right time. Very few are needed, like 5, 10, or 20, during the time interval of interest. If something takes 95% of the time, every stack sample has a 95% chance of being at the right time. Then examine each stack sample to see what's happening; don't just summarize / accumulate / average / do pretend statistics. (If this is done manually under a debugger, you can also examine data variables, giving even more understanding of why the program is spending that moment in time.) The object is to find the problem, not measure it. Anything you can see that could be avoided, if you see it on more than one sample, will save substantial time. Here's how much. The fewer samples you need to see it twice, the more it will save. If you want to see exactly what it saves you, just use a stopwatch before and after. And don't do it just once. Every time you fix a problem, you uncover more, so if you keep on doing it, you may be able to get dramatic speedups.
