FLOP measurement

Problem description

I'm trying to estimate FLOPS for my application using Intel VTune Amplifier, following this article as a guideline: https://software.intel.com/en-us/articles/estimating-flops-using-event-based-sampling-ebs/

The problem is that I can't find the FP_COMP_OPS_EXE event in the VTune GUI. When I run amplxe-cl with this event configuration, I get the following error:

amplxe: Error: Invalid Event FP_COMP_OPS_EXE.X87 discarded.

I'm working on CentOS and my processor is an Intel Xeon.

Any help would be appreciated.

Solution

The set of available events changes between processor generations, so it is important to know your exact processor model. The events you mentioned exist on Intel Xeon v2 (Ivy Bridge based) processors, where you can use the following formula to measure the number of floating-point operations:

FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE + 4 * FP_COMP_OPS_EXE.SSE_PACKED_SINGLE + 8 * SIMD_FP_256.PACKED_SINGLE + FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE + 2 * FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE + 4 * SIMD_FP_256.PACKED_DOUBLE + FP_COMP_OPS_EXE.X87
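The formula is a weighted sum: each event count is multiplied by the number of floating-point operations one counted instruction performs (for example, 8 single-precision lanes in a 256-bit register). A minimal Python sketch of that arithmetic follows. It assumes the per-event counts have already been collected and exported into a dictionary; the function name `estimate_flop_count` and the sample counts are hypothetical and not part of VTune.

```python
# FLOPs per counted event, taken from the Ivy Bridge formula above.
IVY_BRIDGE_WEIGHTS = {
    "FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE": 1,
    "FP_COMP_OPS_EXE.SSE_PACKED_SINGLE": 4,
    "SIMD_FP_256.PACKED_SINGLE": 8,
    "FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE": 1,
    "FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE": 2,
    "SIMD_FP_256.PACKED_DOUBLE": 4,
    "FP_COMP_OPS_EXE.X87": 1,
}


def estimate_flop_count(counts, weights=IVY_BRIDGE_WEIGHTS):
    """Return the weighted sum of event counts (hypothetical helper).

    Raises KeyError if an event required by the formula was not collected,
    so a missing counter is not silently treated as zero.
    """
    missing = [event for event in weights if event not in counts]
    if missing:
        raise KeyError("missing event counts: " + ", ".join(missing))
    return sum(weight * counts[event] for event, weight in weights.items())


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    sample_counts = {
        "FP_COMP_OPS_EXE.SSE_SCALAR_SINGLE": 1000,
        "FP_COMP_OPS_EXE.SSE_PACKED_SINGLE": 200,
        "SIMD_FP_256.PACKED_SINGLE": 50,
        "FP_COMP_OPS_EXE.SSE_SCALAR_DOUBLE": 3000,
        "FP_COMP_OPS_EXE.SSE_PACKED_DOUBLE": 100,
        "SIMD_FP_256.PACKED_DOUBLE": 25,
        "FP_COMP_OPS_EXE.X87": 10,
    }
    print(estimate_flop_count(sample_counts))  # 5510
```

Raising on a missing event is a deliberate choice in this sketch: if the tool discarded an event as invalid, as in the error from the question, the total would otherwise be an undercount with no warning.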

For Haswell-based processors (Xeon v3) there are no such events, so FLOP calculation is not possible there.
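Since the usable events depend on the generation, a reasonable first step is to read the exact model name. The sketch below is for Linux and assumes the name appears under the "model name" key in /proc/cpuinfo and carries a "vN" suffix, as in the Xeon naming used above. The helper names are hypothetical, and the v4 = Broadwell mapping is an assumption not stated in this answer.

```python
import re

# Version suffix -> microarchitecture. v2 and v3 come from the answer text;
# v4 = Broadwell is an assumption added for illustration.
XEON_GENERATIONS = {"v2": "Ivy Bridge", "v3": "Haswell", "v4": "Broadwell"}


def model_name_from_cpuinfo(cpuinfo_text):
    """Return the first 'model name' value in /proc/cpuinfo-style text, or None."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "model name":
            return value.strip()
    return None


def xeon_generation(model_name):
    """Guess the generation from a 'vN' suffix; None if no known suffix is found."""
    match = re.search(r"\bv(\d)\b", model_name)
    if match is None:
        return None
    return XEON_GENERATIONS.get("v" + match.group(1))


def read_model_name(path="/proc/cpuinfo"):
    """Read the model name from the running Linux system."""
    with open(path) as f:
        return model_name_from_cpuinfo(f.read())


if __name__ == "__main__":
    # Sample text in /proc/cpuinfo layout, for illustration only.
    sample = "processor\t: 0\nmodel name\t: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz\n"
    name = model_name_from_cpuinfo(sample)
    print(name, "->", xeon_generation(name))
```

A model name without a recognised suffix returns None rather than a guess, in which case the processor documentation is the place to check which events are available.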

For Broadwell-based processors the formula is the following:

FP_ARITH_INST_RETIRED.SCALAR_SINGLE + 4 * FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE + 8 * FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE + FP_ARITH_INST_RETIRED.SCALAR_DOUBLE + 2 * FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE + 4 * FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE + INST_RETIRED.X87
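The formula gives a FLOP count. To get a rate (FLOPS), divide by the elapsed time of the measured run. The following Python sketch applies the Broadwell weights and also splits the total by precision. The function name `broadwell_flop_summary`, the sample counts, and the elapsed time are hypothetical; events absent from the input are treated as zero here, which is an assumption of this sketch rather than VTune behaviour.

```python
# (event, FLOPs per counted event, precision group), from the Broadwell formula above.
BROADWELL_TERMS = [
    ("FP_ARITH_INST_RETIRED.SCALAR_SINGLE", 1, "single"),
    ("FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE", 4, "single"),
    ("FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE", 8, "single"),
    ("FP_ARITH_INST_RETIRED.SCALAR_DOUBLE", 1, "double"),
    ("FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE", 2, "double"),
    ("FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE", 4, "double"),
    ("INST_RETIRED.X87", 1, "x87"),
]


def broadwell_flop_summary(counts, elapsed_seconds):
    """Return FLOP totals per precision group, the overall total, and GFLOPS."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    summary = {"single": 0, "double": 0, "x87": 0}
    for event, weight, group in BROADWELL_TERMS:
        # Assumption: an event missing from the input contributes zero.
        summary[group] += weight * counts.get(event, 0)
    summary["total"] = summary["single"] + summary["double"] + summary["x87"]
    summary["gflops"] = summary["total"] / elapsed_seconds / 1e9
    return summary


if __name__ == "__main__":
    # Made-up counts and run time, for illustration only.
    sample_counts = {
        "FP_ARITH_INST_RETIRED.SCALAR_SINGLE": 2_000_000,
        "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE": 500_000,
        "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE": 250_000,
        "FP_ARITH_INST_RETIRED.SCALAR_DOUBLE": 1_000_000,
        "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE": 500_000,
        "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE": 250_000,
        "INST_RETIRED.X87": 0,
    }
    print(broadwell_flop_summary(sample_counts, elapsed_seconds=0.5))
```

Keeping the per-precision totals separate makes it easier to compare the result against a single-precision or double-precision peak figure for the machine.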
