x86_64 registers rax/eax/ax/al overwriting full register contents

Question

As is widely advertised, modern x86_64 processors have 64-bit registers that can be used in a backward-compatible fashion as 32-bit registers, 16-bit registers and even 8-bit registers, for example:

0x1122334455667788
  ================ rax (64 bits)
          ======== eax (32 bits)
              ====  ax (16 bits)
              ==    ah (8 bits)
                ==  al (8 bits)
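
The aliasing in the diagram can be modelled with plain masks and shifts. The sketch below is only an illustration in Python, not something the CPU or an assembler provides; the function name `sub_registers` is made up for this example.

```python
def sub_registers(rax):
    """Return the value visible through each alias of a 64-bit rax value."""
    rax &= 0xFFFFFFFFFFFFFFFF        # keep the model to 64 bits
    return {
        "rax": rax,                  # all 64 bits
        "eax": rax & 0xFFFFFFFF,     # low 32 bits
        "ax": rax & 0xFFFF,          # low 16 bits
        "ah": (rax >> 8) & 0xFF,     # bits 8-15
        "al": rax & 0xFF,            # bits 0-7
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```

Reading through any alias is unproblematic; the surprise discussed in this question is only about what a write does to the bits outside the alias.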

Taken literally, such a scheme suggests that reading or writing through a given name only ever touches that part of the register, which would be highly logical. In fact, this is true for everything up to 32-bit:

mov  eax, 0x11112222 ; eax = 0x11112222
mov  ax, 0x3333      ; eax = 0x11113333 (works, only low 16 bits changed)
mov  al, 0x44        ; eax = 0x11113344 (works, only low 8 bits changed)
mov  ah, 0x55        ; eax = 0x11115544 (works, only bits 8-15 changed)
xor  ah, ah          ; eax = 0x11110044 (works, only bits 8-15 cleared)
mov  eax, 0x11112222 ; eax = 0x11112222
xor  al, al          ; eax = 0x11112200 (works, only low 8 bits cleared)
mov  eax, 0x11112222 ; eax = 0x11112222
xor  ax, ax          ; eax = 0x11110000 (works, only low 16 bits cleared)
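
The merging behaviour of the 8-bit and 16-bit writes can be replayed in a small model. This is a sketch in Python under the assumption that a register is just an integer; the helper names `write_ax`, `write_al` and `write_ah` are hypothetical and exist only for this illustration.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_ax(reg, value):
    """Model of `mov ax, value`: replace bits 0-15, keep everything above."""
    return (reg & (MASK64 ^ 0xFFFF)) | (value & 0xFFFF)


def write_al(reg, value):
    """Model of `mov al, value`: replace bits 0-7, keep everything above."""
    return (reg & (MASK64 ^ 0xFF)) | (value & 0xFF)


def write_ah(reg, value):
    """Model of `mov ah, value`: replace bits 8-15, keep the rest."""
    return (reg & (MASK64 ^ 0xFF00)) | ((value & 0xFF) << 8)


# Replay the first few instructions of the listing.
reg = 0x11112222
step_ax = write_ax(reg, 0x3333)
step_al = write_al(step_ax, 0x44)
step_ah = write_ah(step_al, 0x55)
step_clear_ah = write_ah(step_ah, 0x00)   # same result as `xor ah, ah`
print(hex(step_ax), hex(step_al), hex(step_ah), hex(step_clear_ah))
```

Note that every helper takes the old register value as an input: a merging write cannot produce its result without it.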

However, things seem to be fairly awkward as soon as we get to 64-bit stuff:

mov  rax, 0x1111222233334444 ;           rax = 0x1111222233334444
mov  eax, 0x55556666         ; actual:   rax = 0x0000000055556666
                             ; expected: rax = 0x1111222255556666
                             ; upper 32 bits seem to be lost!
mov  rax, 0x1111222233334444 ;           rax = 0x1111222233334444
mov  ax, 0x7777              ;           rax = 0x1111222233337777 (works!)
mov  rax, 0x1111222233334444 ;           rax = 0x1111222233334444
xor  eax, eax                ; actual:   rax = 0x0000000000000000
                             ; expected: rax = 0x1111222200000000
                             ; again, it wiped whole register
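
The observed 64-bit behaviour can be written down the same way. The sketch below models the rule that the Intel manual (Volume 1, in the section on general-purpose registers in 64-bit mode) gives: 32-bit results are zero-extended into the full register, while 8-bit and 16-bit results leave the upper bits unmodified. The Python helpers `write_eax` and `write_ax` are hypothetical names for this illustration only.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def write_eax(rax, value):
    """Model of a 32-bit write in 64-bit mode: zero-extend, ignore old rax."""
    return value & 0xFFFFFFFF


def write_ax(rax, value):
    """Model of a 16-bit write: replace bits 0-15, keep bits 16-63."""
    return (rax & (MASK64 ^ 0xFFFF)) | (value & 0xFFFF)


rax = 0x1111222233334444
after_mov_eax = write_eax(rax, 0x55556666)
after_mov_ax = write_ax(rax, 0x7777)
eax = rax & 0xFFFFFFFF
after_xor_eax = write_eax(rax, eax ^ eax)   # model of `xor eax, eax`
print(hex(after_mov_eax), hex(after_mov_ax), hex(after_xor_eax))
```

In this model `write_eax` never looks at its `rax` argument, which is exactly the difference from the 16-bit case.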

Such behavior seems highly ridiculous and illogical to me. It looks like writing anything at all to eax, by any means, wipes the high 32 bits of the rax register.

So, I have two questions:

  1. I believe that this awkward behavior must be documented somewhere, but I can't seem to find a detailed explanation (of how exactly the high 32 bits of a 64-bit register get wiped) anywhere. Am I right that writing to eax always wipes the upper half of rax, or is it something more complicated? Does it apply to all 64-bit registers, or are there some exceptions?

A strongly related question mentions the same behavior, but, alas, there are again no exact references to documentation.

In other words, I'd like a link to documentation that specifies this behavior.

  2. Is it just me, or does this whole thing seem really weird and illogical (i.e. eax-ax-ah-al and rax-ax-ah-al having one behavior and rax-eax having another)? Maybe I'm missing some vital point here about why it was implemented like that?

An explanation of the "why" would be highly appreciated.

Recommended answer

The processor model documented in the Intel/AMD processor manuals is a pretty imperfect model of the real execution engine of a modern core. In particular, the notion of the processor registers does not match reality: there is no such thing as an EAX or RAX register.

One primary job of the instruction decoder is to convert the legacy x86/x64 instructions into micro-ops, the instructions of a RISC-like processor. These are small instructions that are easy to execute concurrently and can take advantage of multiple execution sub-units, allowing as many as 6 instructions to execute at the same time.

To make that work, the notion of processor registers is virtualized as well. The instruction decoder allocates a register from a big bank of registers. When the instruction is retired, the value of that dynamically allocated register is written back to whatever register currently holds the value of, say, RAX.
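
The idea of virtualized registers can be sketched with a toy rename table. This is a deliberately simplified Python model, not a description of any particular core: the class `RenameTable` and its methods are invented for this example, and unlike real hardware it never recycles slots.

```python
class RenameTable:
    """Toy register renamer: maps architectural names to physical slots."""

    def __init__(self):
        self.physical = []   # the "big bank": one slot per value produced
        self.mapping = {}    # architectural name -> index into self.physical

    def write(self, name, value):
        """Allocate a fresh slot for the new value and point `name` at it."""
        self.physical.append(value)
        self.mapping[name] = len(self.physical) - 1
        return self.mapping[name]

    def read(self, name):
        """Return the value in whichever slot currently stands for `name`."""
        return self.physical[self.mapping[name]]


table = RenameTable()
first_slot = table.write("rax", 1)    # rax now lives in slot 0
second_slot = table.write("rax", 2)   # a later write gets slot 1
print(first_slot, second_slot, table.read("rax"))
```

Because the second write lands in a new slot, work that still needs the old value in slot 0 and work that uses the new value can proceed without getting in each other's way.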

To make that work smoothly and efficiently, allowing many instructions to execute concurrently, it is very important that these operations don't have interdependencies. The worst kind you can have is a register value that depends on other instructions. The EFLAGS register is notorious for this, since many instructions modify it.

The way you would like it to work has the same problem. It requires two register values to be merged when the instruction is retired, creating a data dependency that is going to clog up the core. By forcing the upper 32 bits to 0, that dependency instantly disappears and there is no longer a need to merge. Warp 9 execution speed.
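
The dependency argument can be made concrete by comparing the inputs each design needs. In the Python sketch below, `write_eax_merging` models the behaviour the question expected and `write_eax_zero_extending` models what x86_64 actually does; both names are hypothetical and the functions are only an illustration of the data flow.

```python
UPPER32 = 0xFFFFFFFF00000000
LOWER32 = 0x00000000FFFFFFFF


def write_eax_merging(old_rax, value):
    """Expected-by-the-asker semantics: the result depends on old_rax."""
    return (old_rax & UPPER32) | (value & LOWER32)


def write_eax_zero_extending(value):
    """Actual semantics: the result depends only on the new value."""
    return value & LOWER32


merged = write_eax_merging(0x1111222233334444, 0x55556666)
zeroed = write_eax_zero_extending(0x55556666)
print(hex(merged), hex(zeroed))
```

The merging version cannot be evaluated until `old_rax` is known, so it has to wait for whatever instruction produces that value. The zero-extending version has no such input and can start as soon as the new value is available.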
