How is machine code stored in the EXE file?


Problem Description


My questions are as follows:

  1. How does the Portable Executable format (on Windows/Unix) relate to the x86/x64 instruction set in general?
  2. Does the PE format store the exact set of opcodes supported by the processor, or is it a more generic format that the OS converts to match the CPU?
  3. How does the EXE file indicate the instruction set extensions needed (like 3DNOW! or SSE/MMX?)
  4. Are the opcodes common across all platforms like Windows, Mac and unix?
  5. Intel i386 compatible CPU chips like ones from Intel and AMD use a common instruction set. But I'm sure ARM-powered CPUs use different opcodes. Are these very very different or are the concepts similar? registers, int/float/double, SIMD, etc?

On newer platforms like .NET, Java or Flash, the instruction sets are stack-based opcodes that a JIT converts to the native format at runtime. Being accustomed to such a format I'd like to know how the "old" native EXE format is executed and formatted. For example, "registers" are usually unavailable in newer platform opcodes, since the JIT converts stack commands to the 16/32 available CPU registers as it deems necessary. But in native formats you need to refer to registers by index, and work out which registers can be reused and how often.

Solution

Are ARM opcodes very different from x86 opcodes?

Yes, they are. You should assume that all instruction sets for different processor families are completely different and incompatible. An instruction set first defines an encoding, which specifies one or more of these:

  • the instruction opcode;
  • the addressing mode;
  • the operand size;
  • the address size;
  • the operands themselves.

The encoding further depends on how many registers it can address, whether it has to be backwards compatible, if it has to be decodable quickly, and how complex the instruction can be.

On the complexity: the ARM instruction set requires all operands to be loaded from memory to register and stored from register to memory using specialized load/store instructions, whereas x86 instructions can encode a single memory address as one of their operands and therefore do not have separate load/store instructions.

Then the instruction set itself: different processors will have specialized instructions to deal with specific situations. Even if two processors families have the same instruction for the same thing (e.g. an add instruction), they are encoded very differently and may have slightly different semantics.

As you see, since any CPU designer can decide on all these factors, this makes the instruction set architectures for different processor families completely different and incompatible.
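To make the incompatibility concrete, here is a small sketch comparing hand-assembled encodings of a register-to-register add on both families. The byte values are worked out from the respective architecture manuals and should be treated as assumptions worth checking against a real assembler:

```python
# Hand-assembled encodings of a semantically similar "add" instruction.
# x86 (32-bit): add eax, ebx  ->  opcode 0x01 ("ADD r/m32, r32"),
# ModRM byte 0xD8 (mod=11 register-direct, reg=EBX, rm=EAX).
x86_add = bytes([0x01, 0xD8])

# ARM (32-bit, stored little-endian): add r0, r0, r1  ->  0xE0800001
# (condition=1110 "always", opcode=0100 "ADD", Rn=r0, Rd=r0, operand2=r1).
arm_add = (0xE0800001).to_bytes(4, "little")

print(x86_add.hex())              # 01d8
print(arm_add.hex())              # 010080e0
print(len(x86_add), len(arm_add)) # 2 4 -- even the lengths differ
```

Same abstract operation, completely unrelated byte sequences and instruction lengths.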

Are registers, int/float/double and SIMD very different concepts on different architectures?

No, they are very similar. Every modern architecture has registers and can handle integers, and most can handle IEEE 754 compatible floating-point instructions of some size. For example, the x86 architecture has 80-bit floating-point values that are truncated to fit the 32-bit or 64-bit floating-point values you know. The idea behind SIMD instructions is also the same on all architectures that support it, but many do not support it, and most have different requirements or restrictions.
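The effect of truncating a value into a narrower floating-point format can be observed from Python (whose `float` is a 64-bit double) by round-tripping a number through a 32-bit single:

```python
import struct

x = 0.1  # not exactly representable in binary floating point

# Round-trip through a 32-bit (single precision) float.
x32 = struct.unpack("<f", struct.pack("<f", x))[0]

print(x == x32)             # False: precision was lost in the 32-bit format
print(abs(x - x32) < 1e-7)  # True: but the values are still very close
```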

Are the opcodes common across all platforms like Windows, Mac and Unix?

Given three Intel x86 systems, one running Windows, one running Mac OS X and one running Unix/Linux, then yes the opcodes are exactly the same since they run on the same processor. However, each operating system is different. Many aspects such as memory allocation, graphics, device driver interfacing and threading require operating system specific code. So you generally can't run an executable compiled for Windows on Linux.

Does the PE format store the exact set of opcodes supported by the processor, or is it a more generic format that the OS converts to match the CPU?

No, the PE format does not store the set of opcodes. As explained earlier, the instruction set architectures of different processor families are simply too different to make this possible. A PE file usually stores machine code for one specific processor family and operating system family, and will only run on such processors and operating systems.

There is however one exception: .NET assemblies are also PE files but they contain generic instructions that are not specific to any processor or operating system. Such PE files can be 'run' on other systems, but not directly. For example, mono on Linux can run such .NET assemblies.
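The processor family a PE file targets is recorded in the Machine field of its COFF header, which is how a loader rejects a binary built for the wrong CPU. Below is a minimal sketch of reading that field; it parses a hand-built fake header rather than a real .exe from disk, and lists only a few of the documented machine constants:

```python
import struct

MACHINE_NAMES = {0x014C: "x86 (i386)", 0x8664: "x86-64",
                 0x01C0: "ARM", 0xAA64: "ARM64"}

def pe_machine(data: bytes) -> int:
    """Return the COFF Machine field of a PE image given as bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not a DOS/PE executable")
    # Offset 0x3C of the DOS header holds the offset of the "PE\0\0" signature.
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The 2-byte Machine field immediately follows the signature.
    return struct.unpack_from("<H", data, pe_offset + 4)[0]

# Build a fake minimal header instead of reading a real .exe.
fake = bytearray(0x48)
fake[:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x40)    # e_lfanew -> 0x40
fake[0x40:0x44] = b"PE\0\0"
struct.pack_into("<H", fake, 0x44, 0x8664)  # Machine = x86-64

print(MACHINE_NAMES[pe_machine(bytes(fake))])  # x86-64
```

A .NET assembly carries the same header, which is why it is still a valid PE file even though its real payload is processor-neutral CIL.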

How does the EXE file indicate the instruction set extensions needed (like 3DNOW! or SSE/MMX?)

While the executable can indicate the instruction set for which it was built (see Chris Dodd's answer), I don't believe the executable can indicate the extensions that are required. However, the executable code, when run, can detect such extensions. For example, the x86 instruction set has a CPUID instruction that returns all the extensions and features supported by that particular CPU. The executable would just test that and abort when the processor does not meet the requirements.
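Python cannot execute a CPUID instruction directly, but the test-and-abort logic is just bit masking. The bit positions below (CPUID leaf 1, EDX register) come from Intel's documentation; the sample EDX value is hypothetical:

```python
# CPUID leaf 1 reports feature flags in EDX/ECX; a program tests individual
# bits and aborts if a required extension is missing. Documented positions:
# EDX bit 23 = MMX, bit 25 = SSE, bit 26 = SSE2.
MMX, SSE, SSE2 = 23, 25, 26

def has_feature(edx: int, bit: int) -> bool:
    return bool(edx & (1 << bit))

# Hypothetical EDX value with the MMX, SSE and SSE2 bits set.
edx = (1 << MMX) | (1 << SSE) | (1 << SSE2)

for name, bit in [("MMX", MMX), ("SSE", SSE), ("SSE2", SSE2)]:
    print(name, has_feature(edx, bit))  # all True for this sample value
```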

.NET versus native code

You seem to know a thing or two about .NET assemblies and their instruction set, called CIL (Common Intermediate Language). Each CIL instruction follows a specific encoding and uses the evaluation stack for its operands. The CIL instruction set is kept very general and high-level. When it is run (on Windows by mscoree.dll, on Linux by mono) and a method is called, the Just-In-Time (JIT) compiler takes the method's CIL instructions and compiles them to machine code. Depending on the operating system and processor family the compiler has to decide which machine instructions to use and how to encode them. The compiled result is stored somewhere in memory. The next time the method is called the code jumps directly to the compiled machine code and can execute just as efficiently as a native executable.
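The stack-based flavor of CIL can be mimicked with a tiny interpreter. The opcode names below follow CIL's ldc.i4/add/ret, but the interpreter itself is only an illustrative sketch; the real runtime JIT-compiles such instructions to register-based machine code rather than interpreting them:

```python
def run(program):
    """Interpret a tiny CIL-like stack program; each op is a tuple."""
    stack = []
    for op, *args in program:
        if op == "ldc.i4":    # push a 32-bit integer constant
            stack.append(args[0])
        elif op == "add":     # pop two values, push their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "ret":     # return the top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op!r}")

# Equivalent of "return 2 + 3": no registers are named anywhere;
# a JIT would map these stack slots onto CPU registers.
print(run([("ldc.i4", 2), ("ldc.i4", 3), ("add",), ("ret",)]))  # 5
```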

How are ARM instructions encoded?

I have never worked with ARM, but from a quick glance at the documentation I can tell you the following. An ARM instruction is always 32 bits long. There are many special encodings (e.g. for branching and coprocessor instructions), but the general format of an ARM instruction is like this:

31             28  27  26  25              21  20              16
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--
|   Condition   | 0 | 0 |R/I|    Opcode     | S |   Operand 1   | ...
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+--

                   12                                               0
  --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
... |  Destination  |               Operand 2                       |
  --+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+

The fields mean the following:

  • Condition: A condition that, when true, causes the instruction to be executed. This looks at the Zero, Carry, Negative and Overflow flags. When set to 1110, the instruction is always executed.
  • R/I: When 0, operand 2 is a register. When 1, operand 2 is a constant value.
  • Opcode: The instruction's opcode.
  • S: When 1, the Zero, Carry, Negative and Overflow flags are set according to the instruction's result.
  • Operand1: The index of a register that is used as the first operand.
  • Destination: The index of a register that is used as the destination operand.
  • Operand 2: The second operand. When R/I is 0, the index of a register. When R/I is 1, an unsigned 8-bit constant value. In addition to either one of these, some bits in operand 2 indicate whether the value is shifted/rotated.

For more detailed information you should read the documentation for the specific ARM version you want to know about. I used this ARM7TDMI-S Data Sheet, Chapter 4 for this example.
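The field layout above can be exercised with a small decoder. This is a sketch that handles only the general data-processing format shown in the diagram, nothing else:

```python
def decode_arm_dataproc(word: int) -> dict:
    """Decode the data-processing fields of a 32-bit ARM instruction word.
    Handles only the general format described above."""
    return {
        "condition":   (word >> 28) & 0xF,  # bits 31-28
        "r_i":         (word >> 25) & 0x1,  # bit 25: register/immediate
        "opcode":      (word >> 21) & 0xF,  # bits 24-21
        "s":           (word >> 20) & 0x1,  # bit 20: set flags
        "operand1":    (word >> 16) & 0xF,  # bits 19-16
        "destination": (word >> 12) & 0xF,  # bits 15-12
        "operand2":    word & 0xFFF,        # bits 11-0
    }

# 0xE0800001 is "ADD r0, r0, r1": condition 1110 (always executed),
# opcode 0100 (ADD), operand 1 = r0, destination = r0, operand 2 = r1.
fields = decode_arm_dataproc(0xE0800001)
print(fields)
# {'condition': 14, 'r_i': 0, 'opcode': 4, 's': 0,
#  'operand1': 0, 'destination': 0, 'operand2': 1}
```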

Note that each ARM instruction, no matter how simple, takes 4 bytes to encode. Because of this possible overhead, modern ARM processors allow you to use an alternative 16-bit instruction set called Thumb. It cannot express everything the 32-bit instruction set can, but it is also half the size.

On the other hand, x86-64 instructions have a variable length encoding, and use all kinds of modifiers to adjust the behavior of individual instructions. If you want to compare the ARM instructions with how x86 and x86-64 instructions are encoded, you should read the x86-64 Instruction Encoding article that I wrote on OSDev.org.


Your original question is very broad. If you want to know more, you should do some research and create a new question with the specific thing you want to know.
