perf power consumption measure: How does it work?


Question

I noticed that perf list now has an option to measure power consumption. You can use it as follows:

$ perf stat -e power/energy-cores/ ./a.out 
Performance counter stats for 'system wide':

              8.55 Joules power/energy-cores/

       0.949871058 seconds time elapsed

How accurate is this measurement, and how does perf estimate the power consumption?

Recommended answer

The power/energy-cores/ perf counter is based on an MSR called MSR_PP0_ENERGY_STATUS, which is part of the Intel RAPL interface (Intel seems to call each individual RAPL MSR a RAPL interface). A complicated model based on system activity events is used to estimate (static and dynamic) energy consumption. The PP0 in the register name refers to power plane 0, one of the RAPL domains, which contains all the cores of the socket, including their private caches. PP0, however, excludes the last-level cache, the interconnect, the memory controller, the graphics processor, and everything else in the uncore. It's impossible to measure the accuracy of MSR_PP0_ENERGY_STATUS because there is no independent way to measure the energy consumption of power plane 0 alone.
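To make the counter semantics concrete, here is a minimal Python sketch of how a raw RAPL energy reading can be turned into joules. It assumes the layout described in the Intel manual: the energy status unit is stored in bits 12:8 of MSR_RAPL_POWER_UNIT, and the energy status registers are 32-bit counters that wrap around. The function names are made up for illustration, and obtaining the raw register values (for example through the Linux msr driver, which needs root) is left out.

```python
MSR_RAPL_POWER_UNIT = 0x606    # address of the RAPL unit register (Intel SDM)
MSR_PP0_ENERGY_STATUS = 0x639  # address of the PP0 energy counter (Intel SDM)


def energy_unit_joules(power_unit_msr: int) -> float:
    """Return joules per counter tick from a raw MSR_RAPL_POWER_UNIT value."""
    esu = (power_unit_msr >> 8) & 0x1F  # Energy Status Units field, bits 12:8
    return 1.0 / (1 << esu)


def energy_delta_joules(before: int, after: int, unit: float) -> float:
    """Energy between two raw counter reads, tolerating a single wraparound."""
    ticks = (after - before) & 0xFFFFFFFF  # the counter is 32 bits wide
    return ticks * unit


if __name__ == "__main__":
    # Example raw values only; real ones must be read from the hardware.
    unit = energy_unit_joules(0xA0E03)
    print(f"energy unit: {unit:.3e} J per tick")
    print(f"delta: {energy_delta_joules(0xFFFFFFF0, 0x10, unit):.6f} J")
```

Because the counter wraps, the two reads have to be close enough together that at most one wraparound can occur between them; otherwise the delta is silently too small.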

It's possible to measure the accuracy of other RAPL domains, though. These include the Package, DRAM, and PSys domains. For example, the accuracy of the Package domain energy estimate can be measured by comparing it against the energy consumption of the whole system (which can be measured using a power meter) while running a workload that keeps the energy consumption of everything outside the package as close to a known constant as possible. The accuracy of MSR_PKG_ENERGY_STATUS and MSR_DRAM_ENERGY_STATUS has been measured in different ways by different people on many different processors. You can refer to the recent paper entitled "RAPL in Action: Experiences in Using RAPL for Power Measurements" for more information; it also includes summaries of previous work. The paper covers Sandy Bridge, Ivy Bridge, Haswell, and Skylake. The conclusion is that MSR_PKG_ENERGY_STATUS and MSR_DRAM_ENERGY_STATUS appear to be accurate on Haswell and Skylake (the implementation changed on Haswell; see "An Energy Efficiency Feature Survey of the Intel Haswell Processor"). But this is not necessarily true for all kinds of workloads, P-states, and processors, so the accuracy does not depend on the microarchitecture alone.
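The comparison described above can be sketched as a small calculation. This is only an illustration under stated assumptions: everything outside the package is treated as drawing a constant, known baseline power, power supply losses are ignored, and the function and parameter names are hypothetical rather than taken from any of the cited papers.

```python
def rapl_relative_error(rapl_pkg_joules: float,
                        wall_joules: float,
                        baseline_watts: float,
                        seconds: float) -> float:
    """Relative error of a RAPL package estimate against a wall-meter reference.

    baseline_watts is the assumed constant power of everything outside the
    package (an assumption of this sketch, to be determined separately).
    """
    # Reference package energy: whole-system energy minus the constant baseline.
    reference = wall_joules - baseline_watts * seconds
    if reference <= 0:
        raise ValueError("baseline exceeds the measured wall energy")
    return (rapl_pkg_joules - reference) / reference


if __name__ == "__main__":
    # Made-up numbers, purely to show the arithmetic.
    err = rapl_relative_error(rapl_pkg_joules=95.0, wall_joules=150.0,
                              baseline_watts=5.0, seconds=10.0)
    print(f"relative error: {err:+.1%}")
```

A negative result means RAPL reports less energy than the reference. How trustworthy the number is depends almost entirely on how constant the baseline really is during the run.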

The RAPL interface is discussed in Section 14.9 of the Intel Manual Volume 3. I noticed that there are errors in the section. For example, it says client processors don't support the DRAM domain, which is not true: the client Haswell processor I'm using to write this answer supports the DRAM domain. The section is probably outdated and applies only to Sandy Bridge and Ivy Bridge processors. I think it's better to read the datasheet of the processor on which you want to use RAPL.

The power/energy-pkg/ perf counter can be used to measure the energy consumption of the package domain. This is the only domain that is known to be supported on all Intel processors starting from Sandy Bridge.
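If you want to post-process these numbers, a short Python sketch like the following can pull the energy and elapsed time out of perf stat's text output and derive the average power. It assumes the output looks like the sample in the question (plain decimal points, one "Joules" line per event); the function name is hypothetical, and the same parsing would apply to a power/energy-pkg/ line.

```python
import re


def parse_perf_energy(text: str):
    """Return ({event: joules}, elapsed_seconds) parsed from perf stat output."""
    energy = {
        m.group(2): float(m.group(1).replace(",", ""))
        for m in re.finditer(r"^\s*([\d.,]+)\s+Joules\s+(\S+)", text, re.M)
    }
    m = re.search(r"([\d.,]+)\s+seconds time elapsed", text)
    elapsed = float(m.group(1).replace(",", "")) if m else None
    return energy, elapsed


# Sample output copied from the question.
SAMPLE = """\
Performance counter stats for 'system wide':

              8.55 Joules power/energy-cores/

       0.949871058 seconds time elapsed
"""

if __name__ == "__main__":
    energy, elapsed = parse_perf_energy(SAMPLE)
    for event, joules in energy.items():
        # Average power in watts is energy divided by elapsed time.
        print(f"{event}: {joules} J, {joules / elapsed:.2f} W average")
```

For the run in the question this works out to roughly 9 W averaged over the measurement interval.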
