Why is the default operand size 32 bits in 64-bit mode?


Question

I am reading the Intel documentation, vol. 1, section 3.6.1, "Operand Size and Address Size in 64-Bit Mode". It describes three prefixes: REX.W, the operand-size prefix (66) and the address-size prefix (67). It says that operands default to 32 bits in size, and that the only way to make them 64 bits long is the REX.W instruction prefix (placed after the other prefixes).

I do not understand why this is so. Why can't I use the full 64 bits, for example for an int operand? Does it have something to do with the sign? Or why is there this restriction? (So, does an operation on a C unsigned int use the REX.W prefix? The documentation also mentions that a prefix applies only to a particular instruction, not to the whole segment; the segment's default size, for either address or operand, is contained in the segment descriptor.)

Am I understanding this correctly?

Recommended answer

TL:DR: you have two separate questions: one about C type sizes, and another about how x86-64 machine code encodes 32-bit vs. 64-bit operand-size. The encoding choice is fairly arbitrary and could have been made differently. But int is 32-bit because that's what compiler devs chose; it has nothing to do with machine code.

int is 32-bit because that's still a useful size to use. It uses half the memory bandwidth / cache footprint of int64_t. Most C implementations for 64-bit ISAs have a 32-bit int, including both mainstream ABIs for x86-64 (x86-64 System V and Windows). On Windows, even long is a 32-bit type, presumably for source compatibility with code written for 32-bit that made assumptions about type sizes.
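To see what a given ABI actually chose, you can ask the compiler. This is only a small sketch; the helper names are made up for illustration. The C standard only guarantees minimum widths, so the values for int and long depend on the target (for example, long is typically 64 bits on x86-64 System V and 32 bits on Windows x64).

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width in bits of a few integer types on whatever ABI this is compiled for.
 * The function names are hypothetical helpers, not part of any library. */
size_t bits_of_int(void)       { return sizeof(int) * CHAR_BIT; }
size_t bits_of_long(void)      { return sizeof(long) * CHAR_BIT; }
size_t bits_of_long_long(void) { return sizeof(long long) * CHAR_BIT; }
size_t bits_of_int64(void)     { return sizeof(int64_t) * CHAR_BIT; }
```

Printing these with printf("%zu\n", ...) on your own toolchain shows which sizes it picked; only int64_t is fixed at 64 bits everywhere it exists.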

Also, AMD's integer multiplier at the time was somewhat faster for 32-bit than for 64-bit, and this remained the case until Ryzen. (First-gen AMD64 silicon was AMD's K8 microarchitecture; see https://agner.org/optimize/ for instruction tables.)


x86-64 was designed by AMD in ~2000, as AMD64. Intel was committed to Itanium and not involved; all the design decisions for x86-64 were made by AMD architects.

AMD64 is designed with implicit zero-extension when writing a 32-bit register, so 32-bit operand-size can be used efficiently, with none of the partial-register shenanigans you get with 8-bit and 16-bit operand-size.
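The difference can be modelled in plain C. This is a sketch of the architectural rule only, not of how any CPU implements it, and the function names are invented for the example: a 32-bit write replaces the whole 64-bit register with the zero-extended value, while 16-bit and 8-bit writes merge into the old contents.

```c
#include <stdint.h>

/* Model of writing a 64-bit general-purpose register at different operand sizes.
 * Hypothetical helper names; 'old' is the register value before the write. */

/* 32-bit write (e.g. mov eax, ...): upper 32 bits become zero, old value is irrelevant. */
uint64_t write_reg32(uint64_t old, uint32_t v)
{
    (void)old;
    return (uint64_t)v;
}

/* 16-bit write (e.g. mov ax, ...): upper 48 bits of the old value are preserved. */
uint64_t write_reg16(uint64_t old, uint16_t v)
{
    return (old & ~UINT64_C(0xFFFF)) | v;
}

/* Low-8-bit write (e.g. mov al, ...): upper 56 bits of the old value are preserved. */
uint64_t write_reg8_low(uint64_t old, uint8_t v)
{
    return (old & ~UINT64_C(0xFF)) | v;
}
```

Because the 32-bit case does not depend on the old value, the hardware has no merge to perform and no false dependency on the previous register contents.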

TL:DR: There's good reason for CPUs to want to make 32-bit operand-size available somehow, and for C type systems to have an easily accessible 32-bit type. Using int for that is natural.

If you want 64-bit operand-size, use it. (And then describe it to a C compiler as long long or [u]int64_t, if you're writing C declarations for your asm globals or function prototypes.) Nothing's stopping you, except for somewhat larger code size from needing REX prefixes where you might not have needed them before.

All of that is a totally separate question from how x86-64 machine code encodes 32-bit operand-size.

AMD chose to make 32-bit the default and to make 64-bit operand-size require a REX prefix.
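As a rough illustration of that rule, here is a sketch in C that looks only at the prefix bytes of an ordinary integer instruction in 64-bit mode. The function name is hypothetical, and it deliberately ignores byte-sized opcodes, the other legacy prefixes, and the stack instructions that default to 64-bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: effective operand size in bits for an ordinary
 * (non-byte, non-stack) instruction in 64-bit mode, judged from its prefixes.
 * Simplified sketch: only the 0x66 and 0x67 legacy prefixes are recognised. */
int operand_size_64bit_mode(const uint8_t *code, size_t len)
{
    int has_66 = 0;
    size_t i = 0;

    /* Skip legacy prefixes, remembering the operand-size override. */
    while (i < len && (code[i] == 0x66 || code[i] == 0x67)) {
        if (code[i] == 0x66)
            has_66 = 1;
        i++;
    }

    /* A REX prefix is any byte 0x40..0x4F right before the opcode; bit 3 is W. */
    if (i < len && (code[i] & 0xF0) == 0x40 && (code[i] & 0x08))
        return 64;              /* REX.W=1 takes precedence over 0x66 */

    return has_66 ? 16 : 32;    /* no REX.W: 32-bit default, 0x66 selects 16 */
}
```

For example, 89 D8 (mov eax, ebx) gives 32, 48 89 D8 (mov rax, rbx) gives 64, and 66 89 D8 (mov ax, bx) gives 16.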

They could have gone the other way and made 64-bit operand-size the default, requiring REX.W=0 to set it to 32, or a 0x66 operand-size prefix to set it to 16. That might have led to smaller machine code for code that mostly manipulates things that have to be 64-bit anyway (usually pointers), if it didn't need r8..r15.

A REX prefix is also required to use r8..r15 at all (even as part of an addressing mode), so code that needs lots of registers often finds itself using a REX prefix on most instructions anyway, even when using the default operand-size.

A lot of code does use int for a lot of stuff, so 32-bit operand-size is not rare. And as noted above, it's sometimes faster. So it kind of makes sense to make the fastest instructions the most compact (if you avoid r8d..r15d).

It may also let the decoder hardware be simpler if the same opcode decodes the same way with no prefixes in 32-bit and 64-bit mode. I think this was AMD's real motivation for this design choice. They certainly could have cleaned up a lot of x86 warts but chose not to, probably also to keep decoding more similar to 32-bit mode.

It might be interesting to see whether you'd save overall code size with a version of x86-64 whose default operand-size is 64-bit, e.g. by tweaking a compiler and compiling some existing codebases. You'd want to teach its optimizer to favour the legacy registers RAX..RDI for 64-bit operands instead of 32-bit ones, though, to try to minimize the number of instructions that need REX prefixes.

(Many instructions like add or imul reg,reg can safely be used at 64-bit operand-size even if you only care about the low 32 bits, although the high garbage will affect the FLAGS result.)
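The reason this works is that the low 32 bits of a sum or product depend only on the low 32 bits of the inputs. A minimal C sketch of that property, using invented function names and unsigned arithmetic so that wraparound is well defined:

```c
#include <stdint.h>

/* 32-bit operations: what you would get with 32-bit operand-size. */
uint32_t add32(uint32_t a, uint32_t b) { return a + b; }
uint32_t mul32(uint32_t a, uint32_t b) { return a * b; }

/* 64-bit operations on inputs whose upper halves may hold garbage,
 * keeping only the low 32 bits of the result. */
uint32_t add64_low32(uint64_t a, uint64_t b) { return (uint32_t)(a + b); }
uint32_t mul64_low32(uint64_t a, uint64_t b) { return (uint32_t)(a * b); }
```

For any a and b, add64_low32(a, b) equals add32((uint32_t)a, (uint32_t)b), and likewise for the multiply. This does not carry over to operations where high bits flow into low bits, such as right shifts or division, and it says nothing about FLAGS.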

Re: misinformation in comments: compatibility with 32-bit machine code has nothing to do with this. 64-bit mode is not binary compatible with existing 32-bit machine code; that's why x86-64 introduced a new mode. 64-bit kernels run 32-bit binaries in compat mode, where decoding works exactly like 32-bit protected mode.

https://en.wikipedia.org/wiki/X86-64#OPMODES has a useful table of modes, including long mode (with its 64-bit sub-mode vs. 32-bit and 16-bit compat modes) vs. legacy mode (if you boot a kernel that's not x86-64 aware).

In 64-bit mode some opcodes are different, and operand-size defaults to 64-bit for push/pop and other stack instruction opcodes.

32-bit machine code would decode incorrectly in that mode. For example, 0x40 is inc eax in compat mode but a REX prefix in 64-bit mode. See "x86-32 / x86-64 polyglot machine-code fragment that detects 64bit mode at run-time?" for an example.
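To make the 0x40 example concrete, here is a small C sketch that classifies a single byte depending on the mode. The enum, struct and function names are made up for illustration. In 32-bit code the bytes 0x40..0x47 are the one-byte inc r32 encodings and 0x48..0x4F are dec r32; in 64-bit mode the whole 0x40..0x4F range is repurposed as REX prefixes carrying the W, R, X and B bits.

```c
#include <stdint.h>

/* Hypothetical types for this sketch only. */
enum cpu_mode { MODE_COMPAT_32, MODE_LONG_64 };

struct rex_fields {
    int is_rex;     /* 1 if the byte is a REX prefix in the given mode */
    int w, r, x, b; /* REX bits; all zero when is_rex is 0 */
};

/* Classify one byte at the start of an instruction. */
struct rex_fields classify_byte(uint8_t byte, enum cpu_mode mode)
{
    struct rex_fields f = {0, 0, 0, 0, 0};

    if (mode == MODE_LONG_64 && (byte & 0xF0) == 0x40) {
        f.is_rex = 1;
        f.w = (byte >> 3) & 1;  /* 64-bit operand-size */
        f.r = (byte >> 2) & 1;  /* extends the ModRM.reg field */
        f.x = (byte >> 1) & 1;  /* extends the SIB.index field */
        f.b = byte & 1;         /* extends ModRM.rm, SIB.base or opcode reg */
    }
    /* In MODE_COMPAT_32 the same byte is a complete inc/dec instruction. */
    return f;
}
```

So classify_byte(0x48, MODE_LONG_64) reports a REX prefix with W=1, while the same byte in MODE_COMPAT_32 is not a prefix at all (it is dec eax).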


Having 64-bit mode decode mostly the same way is a matter of sharing transistors in the decoders, not binary compatibility. Presumably it's easier for the decoders to have only 2 mode-dependent default operand sizes (16 or 32-bit) for opcodes like 03 add r, r/m, not 3, with special-casing only for opcodes like push/pop that warrant it. (Also note that REX.W=0 does not let you encode push r32; the operand-size stays at 64-bit.)

AMD's design decisions seem to have been focused on sharing decoder transistors as much as possible, perhaps in case AMD64 didn't catch on and they were stuck supporting it without people using it.

They could have done lots of subtle things to remove annoying legacy quirks of x86, for example making setcc a 32-bit operand-size instruction in 64-bit mode to avoid needing xor-zeroing first. Or CISC annoyances like flags staying unchanged after zero-count shifts (although AMD CPUs handle that more efficiently than Intel, so maybe they intentionally left that in).

Or maybe they thought that subtle tweaks could hurt asm source porting, or in the short term make it harder to get compiler back-ends to support 64-bit code-gen.
