LEA or ADD instruction?

Question

When I'm handwriting assembly, I generally choose the form

lea eax, [eax+4]

over the form

add eax, 4

I have heard that LEA is a "0-clock" instruction (like NOP), while ADD isn't. However, when I look at compiler-produced assembly, I often see the latter form used instead of the former. I'm smart enough to trust the compiler, so can anyone shed some light on which one is better? Which one is faster? Why is the compiler choosing the latter form over the former?

Recommended answer

One significant difference between LEA and ADD on x86 CPUs is the execution unit that actually performs the instruction. Modern x86 CPUs are superscalar and have multiple execution units that operate in parallel, with the pipeline feeding them somewhat like round-robin (bar stalls). The thing is, LEA is processed by (one of) the unit(s) dealing with addressing, which happens at an early stage in the pipeline, while ADD goes to the ALU(s) (arithmetic/logic units), which sit later in the pipeline. That means a superscalar x86 CPU can concurrently execute a LEA and an arithmetic/logical instruction.
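For reference, the two forms from the question can be put side by side. The sketch below is not part of the original answer: it assumes GCC or Clang on x86 (AT&T-syntax inline assembly) and falls back to plain C elsewhere, and the function names `plus4_lea` and `plus4_add` are made up for illustration. It only shows that both instructions produce the same value; it says nothing about which execution unit runs them.

```c
#include <stdint.h>

/* Hypothetical helper: x + 4 computed with LEA.
 * Equivalent to "lea eax, [eax+4]" in Intel syntax. */
uint32_t plus4_lea(uint32_t x) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t out;
    /* LEA only computes the address expression; it reads no memory
     * and leaves the flags register untouched. */
    __asm__("leal 4(%1), %0" : "=r"(out) : "r"(x));
    return out;
#else
    return x + 4u; /* portable fallback on non-x86 targets */
#endif
}

/* Hypothetical helper: x + 4 computed with ADD.
 * Equivalent to "add eax, 4" in Intel syntax. */
uint32_t plus4_add(uint32_t x) {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* ADD updates the operand in place and writes the flags,
     * hence the "cc" clobber. */
    __asm__("addl $4, %0" : "+r"(x) : : "cc");
    return x;
#else
    return x + 4u; /* portable fallback on non-x86 targets */
#endif
}
```

Both helpers wrap around at 32 bits in the same way, so from the caller's point of view they are interchangeable; the only architectural difference visible here is that ADD modifies the flags and LEA does not.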

The fact that LEA goes through the address generation logic instead of the arithmetic units is also the reason why it used to be called "zero-clock": it takes no time to execute because address generation has already happened by the time the instruction would otherwise be executed.

It's not free, since address generation is a step in the execution pipeline, but it's got no execution overhead. And it doesn't occupy a slot in the ALU pipeline(s).

Edit: To clarify, LEA is not free. Even on CPUs that do not implement it via the arithmetic unit, it takes time to execute due to instruction decode / dispatch / retire and/or other pipeline stages that all instructions go through. On CPUs that implement it via address generation, the time taken to do LEA simply occurs in a different stage of the pipeline.
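One way to see what a compiler actually picks is to feed it small functions and read the generated assembly. The file below is a minimal sketch for that purpose, not something from the original answer; the function names `add_four` and `scaled_sum` are hypothetical, and which instruction appears depends on the compiler, its version, the optimization flags, and the target CPU.

```c
#include <stdint.h>

/* Plain increment by a constant: the add-vs-lea choice from the
 * question. Inspect the output to see which one your compiler emits. */
uint32_t add_four(uint32_t x) {
    return x + 4u;
}

/* base + index*4 + 8 has the shape of an x86 addressing mode
 * (base + index*scale + displacement), so a compiler may fold the
 * whole expression into a single LEA. */
uintptr_t scaled_sum(uintptr_t base, uintptr_t index) {
    return base + index * 4u + 8u;
}
```

Compiling with something like `gcc -O2 -S file.c` (or `clang -O2 -S file.c`) writes the assembly to `file.s` without linking, so no `main` is needed. Comparing the output at different optimization levels or for different `-march` targets is a quick way to check whether the choice changes.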
