Using perf to monitor raw event counters


Question

I am trying to measure certain hardware events on an Intel Xeon machine with multiple physical processors. Specifically, I want to know how many requests are issued for reading 'offcore' data.

I found the OFFCORE_REQUESTS hardware event in Intel's documentation. It gives the event descriptor 0xB0 and, for data demands, the additional mask 0x01.

Would it then be correct to tell perf to record the event 0xB1 (i.e. 0xB0 | 0x01) and to call it as:

perf record -e r0B1 ./mytestapp someargs

Or is this incorrect? I ask because perf report shows no output for events entered like this.

The perf documentation is rather sparse in this area. The only material I found is a tutorial entry that uses a raw event (which does work for me) but does not say which event it is or how it was encoded.

Any help is much appreciated.

Recommended answer

OK, so I guess I figured it out.

For the Intel machine I use, the raw event format is <umask><eventselector>, where both parts are hexadecimal values. Leading zeros of the umask can be dropped, but leading zeros of the event selector cannot.

So for the event 0xB0 with the mask 0x01, I can call:

perf record -e r1B0 ./mytestapp someargs
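
To make the encoding concrete, here is a minimal sketch of how the raw event string can be assembled. It assumes the x86 layout in which the event selector occupies the low byte and the umask the next byte of the raw value. That layout would also explain why OR-ing the two numbers into 0xB1 did not work: the result names a different event selector with an empty umask. The helper name raw_event is hypothetical and not part of perf.

```shell
# Hypothetical helper (not part of perf): build a raw event string
# from an Intel umask and event selector, both given in hex.
raw_event() {
    # $1 = umask, $2 = event selector.
    # The umask is printed without padding (leading zeros may be dropped);
    # the event selector is always printed as two hex digits.
    printf 'r%X%02X\n' "$(($1))" "$(($2))"
}

raw_event 0x01 0xB0   # prints r1B0
```

The result could then be passed on, for example as perf record -e "$(raw_event 0x01 0xB0)" ./mytestapp someargs.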

I could not find the exact parsing of this format in the perf kernel code (any kernel hackers here?), but I found these sources:

  • An article on using perf with raw events in c't magazine 13/03 (subscription required), which lists some raw events together with their descriptions from the Intel Architecture Software Developer's Manual (Vol. 3B)
  • A patch on the kernel mailing list discussing the proper way to document the format. It noted that the pattern above "... was x86 specific and incomplete at that"
  • (Updated) The man page of newer perf versions shows an example for Intel machines: man perf-list
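
Newer perf versions also accept a symbolic form that names the fields explicitly, which avoids the digit-ordering question altogether. The sketch below only builds that string; it assumes the core PMU is exposed under the name cpu, which may differ on some systems, and the helper name pmu_event is hypothetical.

```shell
# Hypothetical helper (not part of perf): print the symbolic PMU form
# of an event. Assumes the core PMU is named "cpu" on this machine.
pmu_event() {
    # $1 = umask, $2 = event selector, both given in hex.
    printf 'cpu/event=0x%02x,umask=0x%02x/\n' "$(($2))" "$(($1))"
}

pmu_event 0x01 0xB0   # prints cpu/event=0xb0,umask=0x01/
```

If the assumption holds, the string can be used in place of the raw code, for example perf stat -e "$(pmu_event 0x01 0xB0)" ./mytestapp someargs.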

Update: As pointed out in the comments (thank you!), the libpfm translator can be used to obtain the proper event descriptor. The website linked in the comments (Bojan Nikolic: How to monitor the full range of CPU performance events), found by user 'osgx', explains this in more detail.
