What would be the benefit of moving a register to itself in x86-64?

Question

I'm doing a project in x86-64 NASM and came across the instruction:

mov rdi, rdi

in the output of a compiler my professor wrote.

I have searched all over but can't find mention of why this would be needed. Does it affect the flags or is it something clever that I don't understand?

To give some context it's present in a loop right before the same register is decremented with sub.

Recommended answer

The instruction mov rdi, rdi is just an inefficient 3-byte NOP, architecturally equivalent to an actual NOP instruction. When assembled, it produces the bytes:

48 89 ff       mov rdi, rdi
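
As a rough illustration of where those three bytes come from, here is a minimal Python sketch that builds the encoding by hand. It assumes the `89 /r` form (MOV r/m64, r64) with a REX.W prefix and handles only the eight legacy registers; the helper name `encode_mov_r64_r64` is made up for this example and is not part of any assembler.

```python
# Register numbers as used in the ModRM byte (legacy registers only, no r8-r15).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def encode_mov_r64_r64(dst: str, src: str) -> bytes:
    """Encode `mov dst, src` using opcode 89 /r (MOV r/m64, r64).

    Hypothetical helper for illustration; a real assembler may pick the
    equivalent 8B /r form instead.
    """
    rex_w = 0x48   # REX prefix with W=1: 64-bit operand size
    opcode = 0x89  # MOV r/m64, r64
    # ModRM: mod=11 (register direct), reg=source, rm=destination
    modrm = (0b11 << 6) | (REG64.index(src) << 3) | REG64.index(dst)
    return bytes([rex_w, opcode, modrm])


encoded = encode_mov_r64_r64("rdi", "rdi")
print(encoded.hex(" "))  # 48 89 ff
```

With both fields set to rdi (register number 7), the ModRM byte is 0b11_111_111 = 0xFF, which matches the listing above.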

It can be considered a NOP because it affects neither the flags nor the contents of any register. Its only architectural effect is to advance the program counter to the next instruction.

It's common to use (multi-byte) NOPs to align the next instruction to a certain address, a popular example being an aligned jump target, especially at the top of a loop.
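
To make the padding idea concrete, the following Python sketch computes how many bytes are needed to reach the next aligned address and fills them with multi-byte NOPs. The NOP table follows the recommended multi-byte sequences listed in Intel's manuals; the addresses, the 16-byte default, and the function names are assumptions chosen for illustration. In practice an assembler directive such as NASM's align does this work for you.

```python
# Recommended multi-byte NOP encodings, indexed by length in bytes.
NOPS = {
    1: "90",
    2: "66 90",
    3: "0f 1f 00",
    4: "0f 1f 40 00",
    5: "0f 1f 44 00 00",
    6: "66 0f 1f 44 00 00",
    7: "0f 1f 80 00 00 00 00",
    8: "0f 1f 84 00 00 00 00 00",
    9: "66 0f 1f 84 00 00 00 00 00",
}


def padding_bytes(address: int, alignment: int = 16) -> int:
    """Number of bytes needed to advance `address` to the next multiple of `alignment`."""
    return (-address) % alignment


def nop_padding(address: int, alignment: int = 16) -> bytes:
    """Fill the gap up to the next aligned address with as few NOPs as possible."""
    remaining = padding_bytes(address, alignment)
    out = bytearray()
    while remaining:
        size = min(remaining, 9)  # longest NOP in the table is 9 bytes
        out += bytes.fromhex(NOPS[size])
        remaining -= size
    return bytes(out)


# Hypothetical address of the instruction that would otherwise start the loop.
pad = nop_padding(0x401005)
print(len(pad), pad.hex(" "))
```

For the hypothetical address 0x401005 the gap is 11 bytes, so the sketch emits one 9-byte NOP followed by one 2-byte NOP, and the loop top lands on 0x401010.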

But in this case, it appears it's just an artifact of code-generation from a non-optimizing compiler, not being used for intentional padding.

It's inefficient compared to a true nop because it isn't special-cased to run more cheaply (its microarchitectural effect differs on current CPUs). It adds a cycle of latency to the dependency chain through RDI and uses an ALU execution unit. Neither Intel nor AMD CPUs can "eliminate" mov same,same and run it with zero latency in the register-rename stage; that only works between different architectural registers. For example, mov rax,rdi can be about as cheap as a nop on IvyBridge+ and Ryzen, if you don't mind clobbering RAX.

In your case, you should just remove it rather than replace it with 66 66 90 (a short NOP with redundant operand-size prefixes) or 0F 1F 00 (a long NOP), because it's not being used for padding.
