Linux time sample based profiler

Question

Short version:

Is there a good time based sampling profiler for Linux?

Long version:

I generally use OProfile to optimize my applications. I recently found a shortcoming that has me wondering.

The problem was a tight loop that spawned c++filt to demangle a C++ name. I only stumbled upon the code by accident while chasing down another bottleneck. The OProfile output didn't show anything unusual about that code, so I almost ignored it, but my code sense told me to optimize the call and see what happened. I changed the popen of c++filt to abi::__cxa_demangle. The runtime went from more than a minute to a little over a second, about a 60x speedup.
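For readers who want to see the shape of that change, here is a minimal sketch of in-process demangling. It assumes GCC or Clang (the Itanium C++ ABI header `<cxxabi.h>`); the helper name `demangle` is hypothetical and is not taken from the original program.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <memory>     // std::unique_ptr
#include <string>

// Hypothetical helper: demangle a symbol in-process instead of spawning
// c++filt through popen() on every loop iteration.
// Returns the input unchanged when it is not a valid mangled name.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle malloc()s the result,
    // so ownership goes to a unique_ptr that calls free().
    std::unique_ptr<char, decltype(&std::free)> buffer(
        abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status),
        &std::free);
    return (status == 0 && buffer) ? std::string(buffer.get()) : mangled;
}
```

Calling `demangle("_Z3foov")` should yield `foo()`. No child process is created, which is presumably where the time in the popen version went.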

Is there a way I could have configured OProfile to flag the popen call? As the profile data sits now, OProfile thinks the bottleneck was the heap and std::string calls (which, by the way, once optimized dropped the runtime to less than a second, more than a 2x speedup).

Here is my OProfile configuration:

$ sudo opcontrol --status
Daemon not running
Event 0: CPU_CLK_UNHALTED:90000:0:1:1
Separate options: library
vmlinux file: none
Image filter: /path/to/executable
Call-graph depth: 7
Buffer size: 65536

Is there another profiler for Linux that could have found the bottleneck?

I suspect the issue is that OProfile only logs its samples to the currently running process. I'd like it to always log its samples to the process I'm profiling. So if the process is currently switched out (blocking on IO or a popen call) OProfile would just place its sample at the blocked call.

If I can't fix this, OProfile will only be useful when the executable is pushing near 100% CPU. It can't help with executables that have inefficient blocking calls.
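The gap between CPU time and wall-clock time described above can be observed directly. The sketch below is only an illustration, not part of OProfile: `measure` and `blocking_work` are hypothetical names, and the sleep stands in for a blocking call such as waiting on a child process. It relies on `std::clock()` reporting the calling process's CPU time, as it does on Linux.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct Timing {
    double wall_seconds;  // elapsed real time
    double cpu_seconds;   // CPU time consumed by this process
};

// Run `work` once and report both wall-clock and CPU time.
template <typename F>
Timing measure(F&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    work();
    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();
    return {
        std::chrono::duration<double>(wall_end - wall_start).count(),
        static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC
    };
}

// Hypothetical stand-in for a blocking call: the process is switched
// out for 200 ms and uses almost no CPU while it waits.
void blocking_work() {
    std::this_thread::sleep_for(std::chrono::milliseconds(200));
}
```

For `measure(blocking_work)`, the wall-clock figure should be about 0.2 s while the CPU figure stays near zero. A profiler that samples only while the process is on the CPU sees almost none of that interval, which matches the suspicion above.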

Recommended answer

Glad you asked. I believe OProfile can be made to do what I consider the right thing, which is to take stack samples on wall-clock time while the program is being slow and, if it won't let you examine individual stack samples, at least report, for each line of code that appears on samples, the percentage of samples that line appears on. That percentage is a direct measure of what would be saved if that line were not there. Here's one discussion. Here's another, and another. And, as Paul said, Zoom should do it.

If your time went from 60 sec to 1 sec, that implies every single stack sample would have had a 59/60 probability of showing you the problem.
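The arithmetic behind that figure can be spelled out. This assumes the samples are independent and spread uniformly over wall-clock time, and it uses the rounded 60 s and 1 s runtimes; the sample count $n$ is introduced here only for illustration.

```latex
% Fraction of wall-clock time spent in the removable call:
P(\text{a sample shows the problem}) = \frac{60 - 1}{60} = \frac{59}{60} \approx 0.983

% Chance that n independent samples all miss it:
P(\text{all } n \text{ samples miss}) = \left(\frac{1}{60}\right)^{n},
\qquad n = 3:\ \frac{1}{216000} \approx 4.6 \times 10^{-6}
```

Under those assumptions, even a handful of wall-clock stack samples would almost certainly have landed in the popen call.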
