Relative performance of x86 inc vs. add instruction
Question
Quick question. Assuming beforehand
mov eax, 0
which is more efficient?
inc eax
inc eax
or
add eax, 2
Also, in case the two inc instructions are faster, do compilers (say, GCC) commonly (i.e. without aggressive optimization flags) optimize var += 2 into them?
PS: Don't bother answering with a variation of "don't prematurely optimize"; this is purely academic interest.
Recommended answer
Two inc instructions on the same register (or, more generally, any two read-modify-write instructions) always form a dependency chain of at least two cycles, whereas a single add eax, 2 needs only one. This assumes a one-clock latency for inc, which has been the case since the 486. So if the surrounding instructions can't be interleaved with the two inc instructions to hide that latency, the code will execute more slowly.
But no compiler will emit the instruction sequence you propose anyway (mov eax,0 will be replaced by xor eax,eax; see "What is the purpose of XORing a register with itself?").
mov eax,0
inc eax
inc eax
will be optimized to
mov eax,2
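One way to check this on your own toolchain is to compile a small test file and read the generated assembly. The following is only a sketch: the function names add_two_by_inc and add_two_by_add are hypothetical, made up for illustration, and the instructions you actually see depend on the compiler version, the target, and the flags used.

```c
/* Hypothetical helper: two separate increments,
   mirroring "inc eax / inc eax" at the source level. */
int add_two_by_inc(int x) {
    x++;
    x++;
    return x;
}

/* Hypothetical helper: a single addition,
   mirroring "add eax, 2" at the source level. */
int add_two_by_add(int x) {
    x += 2;
    return x;
}
```

Assuming the file is saved as example.c (an arbitrary name), something like gcc -S -masm=intel example.c writes Intel-syntax assembly to example.s; adding -O2 lets you compare the unoptimized and optimized output for the two functions.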