What is the function of a "data label" in an x86 assembler?

Question

I'm currently learning assembly programming by following Kip Irvine's "assembly language x86 programming" book.

In the book, the author tries to explain the concept of a data label:

A data label identifies the location of a variable, providing a convenient way to reference the variable in code. The following, for example, defines a variable named count:

count DWORD 100

The assembler assigns a numeric address to each label.

So my understanding of what a data label does is: the data label count is a variable that contains a numeric value, where the numeric value is a location in memory. When I use count in my code, I'm actually using the value contained in that location in memory, in this instance, 100.

Is my understanding of data label correct? If it is somewhat incorrect, could someone please point the mistake out?

Recommended answer

Labels are a symbolic way to write memory addresses, nothing more, nothing less. A label itself takes no space, and is just a handy way to let you refer to that spot in memory later.

(Well, they can also turn into symbols in an object file to allow numeric addresses to be calculated at link time, instead of at assemble time. But for labels defined and referenced in the same file, this extra complexity is mostly invisible; see below about addresses being link-time constants, not assemble-time.)

For example:

; NASM syntax, but the concepts apply exactly to MASM as well
; For MASM, you may need  BYTE PTR or whatever size overrides in loads.
section .rodata     ; or section .data  if you want to be able to store here, too.
COUNT:
   db 0x12
FOO:
   db 0
BAR:
   dw 0x80FF      ; same as   db 0xff, 0x80

A 4-byte load like mov eax, [COUNT] will get 0x80FF0012 (since x86 is little-endian). A 2-byte load from FOO like mov cx, [FOO] will get 0xFF00.

You might actually use overlapping loads from a constant this way, e.g. with strings where some are substrings of others. For null-terminated strings, only common suffixes can be combined into the same storage space this way.
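A minimal NASM sketch of that suffix sharing. The label names `greeting` and `world` are made up for illustration:

```nasm
section .rodata
greeting:               ; address of the 'h'
    db "hello "
world:                  ; address of the 'w', 6 bytes past greeting
    db "world", 0       ; a single terminator ends both strings
```

Passing `greeting` to a routine that expects a null-terminated string yields "hello world", while passing `world` yields just "world". Neither label takes any space of its own.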

Now does this mean that COUNT is a 4-byte variable or a 1-byte variable? No, neither. Assembly language doesn't really have "variables".

Variables are a higher-level concept that you can implement in assembly language with a label and an assembler directive that reserves some static space. Notice that the labels are separate from the db directives in the example above.
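As a rough sketch of that "label plus reserved space" pattern, assuming 32-bit NASM code (the names `count`, `total` and `example` are only illustrative), note the difference between using the label as an address and loading what is stored there:

```nasm
; Sketch only: 32-bit NASM, label names are made up for illustration.
section .data
count:  dd 100              ; label, then 4 bytes of initialized storage

section .bss
total:  resd 1              ; label, then 4 reserved (zeroed) bytes

section .text
example:
    mov eax, [count]        ; load the 4 bytes stored at count (100)
    mov ebx, count          ; load the address itself, as an immediate
    mov [total], eax        ; store 100 into the reserved space
    ret
```

In MASM syntax the same distinction is spelled differently: `mov eax, count` loads the value, and `mov ebx, OFFSET count` gets the address.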

But a variable doesn't need to have any static storage space: e.g. your loop counter variable can (and often should) exist only in a register.
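For instance, a loop might keep both its counter and its running sum purely in registers, with no data label anywhere. This is a sketch with a hypothetical function name `sum_to_ten`:

```nasm
section .text
sum_to_ten:
    xor eax, eax            ; running sum lives in EAX
    mov ecx, 10             ; loop counter lives in ECX
.next:
    add eax, ecx            ; sum += counter
    dec ecx                 ; sets ZF when the counter reaches 0
    jnz .next               ; repeat while the counter is non-zero
    ret                     ; EAX = 10 + 9 + ... + 1 = 55
```

The only labels here are code labels (`sum_to_ten` and the local `.next`), yet the routine clearly has two "variables" in the high-level sense.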

A variable doesn't even need to have a single fixed location. It can be spilled to the stack in part of a function where it's not used, but live in registers in another part of a function. In compiler-generated code, variables often move between registers for no reason because compilers don't even try to use the same register for the same variable.

Note that MASM does implicitly associate a label with an operand-size based on the directive that follows it. So you might have to write mov eax, dword ptr [count] if mov eax, [count] gives an operand-size mismatch error.

Some people consider this a feature, but others think this magic operand-size stuff is totally weird. NASM syntax doesn't have any of this magic. You can tell how a line will assemble without having to go and find where the labels are defined. add [count], 1 is an error in NASM, because nothing implies an operand-size.
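A short NASM sketch of how the operand size has to come from the instruction itself, assuming 32-bit code and an illustrative `count` label:

```nasm
section .data
count:  dd 100

section .text
example:
    mov eax, [count]        ; OK: EAX implies a 32-bit load
    add dword [count], 1    ; OK: size stated explicitly
    add byte  [count], 1    ; also assembles: modifies only the low byte
    ; add [count], 1        ; NASM rejects this: nothing gives the size
    ret
```

The `dd` after the label has no influence on any of these lines; NASM only looks at the register or the explicit `byte`/`dword` keyword.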

Don't get stuck thinking that everything you'd use a variable for in C must have static storage with a label in your assembly language programs. But if you do want to use the term "variable" for static data storage plus a label, like Kip Irvine does, then go ahead.

Also note that data labels are not special or different from code labels. Nothing stops you from writing jmp COUNT. Decoding 12 00 FF 80 as a (sequence of) x86 instruction(s) is left as an exercise for the reader, but (if it's in a page with execute permission), it will be fetched and decoded by the CPU.

Similarly, nothing stops you from loading data from a code label as a memory operand. Mixing code and data is usually a bad idea for performance reasons (x86 CPUs use split L1D and L1I caches), but it works too. In a typical OS (like Linux), the text segment of an executable contains the code and read-only data sections, and is mapped with read and execute permission. (But not write permission, so trying to store will fault unless you modified the permissions.)
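A small sketch of reading through a code label, assuming code that can use absolute addresses (32-bit, or a non-PIE 64-bit build). The names `just_ret` and `peek` are hypothetical:

```nasm
section .text
just_ret:
    ret                         ; assembles to the single byte 0xC3

peek:
    movzx eax, byte [just_ret]  ; EAX = 0xC3, read through a code label
    ret
```

The load works because the page is readable; a store to `[just_ret]` would fault on a normally mapped text segment.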

A JIT-compiler writes machine code to a buffer and then jumps there. It could be a static buffer with a label, but more usually it would be a dynamically-allocated buffer whose address is a variable.

Static addresses are usually link-time constants, but often not assemble-time constants. (Unless you're writing a bootloader, or something else that is definitely loaded at a known address, then org 0x100 might be useful.) This means you can do mov al, [COUNT+2], but not mov al, [COUNT*2]. (Object-file formats support integer displacements, but not other math operators).
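To make that concrete, here is a sketch that reuses the `COUNT` bytes from the earlier example (`example` is just an illustrative label):

```nasm
section .data
COUNT:  db 0x12, 0, 0xFF, 0x80

section .text
example:
    mov al, [COUNT + 2]     ; fine: symbol + constant fits in a relocation
    ; mov al, [COUNT * 2]   ; rejected: a relocation cannot scale a symbol
    ret
```

The assembler emits the instruction with a placeholder address plus the `+2` addend, and the linker fills in the final value once it knows where `COUNT` ends up.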

In PIC code, label addresses are not even link-time constants, but at least in 64-bit PIC code the offset from code to a data label is a link-time constant, so RIP-relative addressing can be used without an extra level of indirection (through the Global Offset Table).
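In NASM, RIP-relative addressing might look like the following sketch, which assumes 64-bit code and illustrative label names:

```nasm
default rel                 ; make [symbol] operands RIP-relative by default

section .data
count:  dd 100

section .text
example:
    mov eax, [count]        ; encoded as [rip + disp32]
    mov eax, [rel count]    ; same thing, requested explicitly
    lea rdi, [count]        ; run-time address of count, computed from RIP
    ret
```

The displacement is the distance from the end of the instruction to `count`, which stays the same wherever the image is loaded.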
