Relative performance of x86 inc vs. add instruction


Question

Quick question, assuming beforehand

mov eax, 0

Which is more efficient, the two inc instructions or the single add?

inc eax
inc eax

add eax, 2

Also, in case the two incs are faster, do compilers (say, GCC) commonly (i.e. without aggressive optimization flags) optimize var += 2 to it?

PS: Don't bother answering with a variation of "don't prematurely optimize"; this is merely academic interest.

Recommended answer

Two inc instructions on the same register (or, more generally, two read-modify-write instructions) always form a dependency chain of at least two cycles, because the second inc cannot start until the first has produced its result. This assumes a one-clock latency for an inc, which has been the case since the 486. So if the surrounding instructions can't be interleaved with the two inc instructions to hide those latencies, the code will execute more slowly than a single add eax, 2.
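To make the dependency-chain argument concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions: every instruction is a read-modify-write on a single register with the one-cycle latency mentioned above, and the machine can issue any number of independent instructions at once, so only data dependencies limit execution. The names `struct insn`, `NUM_REGS`, and `critical_path` are made up for this illustration and do not describe any real CPU or tool.

```c
#include <assert.h>

/* Toy model (hypothetical): a handful of registers, each instruction
   reads, modifies, and writes back exactly one of them. */
enum { NUM_REGS = 4 };

struct insn {
    int reg;      /* index of the register that is read and written */
    int latency;  /* cycles until the result becomes available */
};

/* Returns the length in cycles of the longest dependency chain,
   assuming unlimited issue width (an assumption of this sketch). */
int critical_path(const struct insn *code, int n)
{
    int ready[NUM_REGS] = {0};  /* cycle at which each register is ready */
    int longest = 0;

    for (int i = 0; i < n; i++) {
        int r = code[i].reg;
        assert(r >= 0 && r < NUM_REGS);
        /* The instruction waits for the previous writer of r, then
           takes `latency` more cycles to produce its own result. */
        ready[r] += code[i].latency;
        if (ready[r] > longest)
            longest = ready[r];
    }
    return longest;
}
```

In this model, inc eax; inc eax is `{ {0, 1}, {0, 1} }` and has a critical path of 2, while add eax, 2 is `{ {0, 1} }` with a critical path of 1. Placing an independent instruction on another register between the two incs, `{ {0, 1}, {1, 1}, {0, 1} }`, still gives 2 cycles in total, which is the sense in which surrounding work can hide the latency.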

But no compiler will emit the instruction sequence you propose anyway (mov eax,0 will be replaced by xor eax,eax; see "What is the purpose of XORing a register with itself?"). The sequence

mov eax,0
inc eax
inc eax

will be optimized to

mov eax,2
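One way to check this on your own toolchain is to compile a small C file and read the generated assembly, for example with `gcc -O2 -S`. The sketch below is only an illustration: the function names `two_incs` and `one_add` are hypothetical, and the exact instructions emitted depend on the compiler, its version, and the flags used.

```c
/* Two increments of a local variable that starts at zero. */
int two_incs(void)
{
    int x = 0;
    x++;
    x++;
    return x;
}

/* A single addition of two to a local variable that starts at zero. */
int one_add(void)
{
    int x = 0;
    x += 2;
    return x;
}
```

Both functions compute the same constant, so with optimization enabled one would expect a compiler to fold each of them down to loading 2 into the return register, rather than emitting either the inc pair or the add.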
