Instruction Level Profiling: The Meaning of the Instruction Pointer?

Question

When profiling code at the assembly instruction level, what does the position of the instruction pointer really mean, given that modern CPUs don't execute instructions serially or in order? For example, assume the following x64 assembly code:

mov RAX, [RBX];         // Assume a cache miss here.
mov RSI, [RBX + RCX];   // Another cache miss.             
xor R8, R8;        
add RDX, RAX;           // Dependent on the load into RAX.
add RDI, RSI;           // Dependent on the load into RSI.

Which instruction will the instruction pointer spend most of its time on? I can think of good arguments for all of them:

  • mov RAX, [RBX] probably takes hundreds of cycles because it's a cache miss.
  • mov RSI, [RBX + RCX] also takes hundreds of cycles, but probably executes in parallel with the previous instruction. What does it even mean for the instruction pointer to be on one or the other of these?
  • xor R8, R8 probably executes out of order and finishes before the memory loads finish, but the instruction pointer might stay here until all previous instructions have also finished.
  • add RDX, RAX generates a pipeline stall because it's the instruction where the value of RAX is actually used after a slow cache-miss load into it.
  • add RDI, RSI also stalls because it depends on the load into RSI.

Recommended answer

CPUs maintain the fiction that there are only the architectural registers (RAX, RBX, etc.) and that there is a single, specific instruction pointer (IP). Programmers and compilers target this fiction.

Yet, as you noted, modern CPUs don't execute serially or in order. Until you, the programmer or user, request the IP, it behaves a bit like quantum physics: the IP is a wave of instructions being executed, all so that the processor can run the program as fast as possible. When you request the current IP (for example, via a debugger breakpoint or a profiler interrupt), the processor must recreate the fiction you expect: it collapses this waveform (all the "in flight" instructions), gathers the register values back into their architectural names, builds a context for executing the debugger routine, and so on.

In this context, there is an IP that indicates the instruction where the processor should resume execution. During the out-of-order execution, this instruction was the oldest instruction yet to complete, even though at the time of the interrupt the processor was perhaps fetching instructions well past that point.
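As a rough illustration of the "oldest instruction yet to complete" rule, here is a toy Python model of the five instructions from the question. It is only a sketch under loud assumptions: the latencies (200 and 220 cycles for the loads, 1 cycle for everything else) are made-up numbers, every instruction is assumed to start as soon as its inputs are ready, and one sample is taken on every cycle. Real hardware adds effects that this model ignores.

```python
from collections import Counter

# (text, latency in cycles, indices of earlier instructions it depends on)
# The latencies are hypothetical illustrative values, not measurements.
PROGRAM = [
    ("mov RAX, [RBX]",       200, []),   # assumed cache miss
    ("mov RSI, [RBX + RCX]", 220, []),   # assumed cache miss, overlaps the first
    ("xor R8, R8",             1, []),
    ("add RDX, RAX",           1, [0]),  # waits for the load into RAX
    ("add RDI, RSI",           1, [1]),  # waits for the load into RSI
]

def completion_cycles(program):
    """Cycle at which each instruction finishes executing (out of order)."""
    done = []
    for _, latency, deps in program:
        start = max((done[d] for d in deps), default=0)
        done.append(start + latency)
    return done

def retirement_cycles(done):
    """Instructions retire in program order: never before an older one."""
    retired, latest = [], 0
    for cycle in done:
        latest = max(latest, cycle)
        retired.append(latest)
    return retired

def sample_histogram(program):
    """Attribute one sample per cycle to the oldest unretired instruction."""
    retired = retirement_cycles(completion_cycles(program))
    hist = Counter()
    for cycle in range(retired[-1]):
        oldest = next(i for i, r in enumerate(retired) if r > cycle)
        hist[program[oldest][0]] += 1
    return hist

if __name__ == "__main__":
    for text, samples in sample_histogram(PROGRAM).items():
        print(f"{samples:4d}  {text}")
```

With these assumed numbers, the first load collects 200 samples, the second load collects 20 (only the cycles where it outlasts the first), the final add collects 1, and the xor and the first add collect none, even though both spent time in flight.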

For example, the interrupt might report mov RSI, [RBX + RCX] as the IP even though the xor had already executed and completed; when the processor resumes execution after the interrupt, it will re-execute the xor.
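The re-execution scenario can be sketched the same way. The snippet below is a hypothetical model, not a description of any particular CPU: the finish cycles are invented, and `interrupt_at` is an illustrative helper that simply follows the description above (report the oldest unfinished instruction, discard younger results).

```python
# (text, cycle at which execution finishes), in program order.
# The cycle numbers are hypothetical values chosen for illustration.
FINISH = [
    ("mov RAX, [RBX]",       200),
    ("mov RSI, [RBX + RCX]", 220),
    ("xor R8, R8",             1),
    ("add RDX, RAX",         201),
    ("add RDI, RSI",         221),
]

def interrupt_at(cycle, finish=FINISH):
    """Return (reported_ip, redone) for an interrupt taken at `cycle`.

    reported_ip: the oldest instruction that has not finished yet; every
                 older instruction has retired and stays committed.
    redone:      younger instructions that had already finished out of
                 order; their results are discarded and recomputed when
                 execution resumes at reported_ip.
    """
    for index, (text, done) in enumerate(finish):
        if done > cycle:
            redone = [t for t, d in finish[index + 1:] if d <= cycle]
            return text, redone
    return None, []  # nothing is in flight any more

if __name__ == "__main__":
    ip, redone = interrupt_at(210)
    print("reported IP:", ip)
    print("re-executed after resume:", redone)
```

In this model, an interrupt at cycle 210 reports mov RSI, [RBX + RCX] as the IP, and both xor R8, R8 and add RDX, RAX are listed as work that is redone after the interrupt returns.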
