How does $ work in NASM, exactly?

Question

message db "Enter a digit ", 0xA,0xD
Length equ $- message

Is it used to get the length of a string?
How does it work internally?

Recommended answer

This gets the assembler to calculate the string length for you at assemble time.

$ is the address of the current position before emitting the bytes (if any) for the line it appears on. Section 3.5 of the manual doesn't go into much detail.

$ - msg is like writing here - msg, i.e. the distance in bytes between the current position (at the end of the string) and the start of the string. (See also this tutorial on NASM labels and directives like resb.)
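
As a minimal sketch of that arithmetic (assuming a flat binary built with nasm -f bin; the names len and stored are made up for illustration):

```nasm
        org 0                           ; assumption: flat binary, output starts at address 0
msg:    db "Enter a digit ", 0xA, 0xD   ; 14 characters + 2 bytes = 16 bytes at offsets 0..15
len     equ $ - msg                     ; $ is 16 here (just past the string), so len = 16
stored: dd  $ - msg                     ; $ is the start of this line, still 16, so this also stores 16
```

Neither equ nor the label definitions emit bytes, so $ only advances past lines such as db and dd that actually produce output.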

(Related: Most other x86 assemblers also use $ the same way, except for GAS which uses . (period). MMIX assembler uses @, which has the right semantic meaning).

To understand it better, it may help to see what happens when you get it wrong, as in the question "In NASM labels next to each other in memory are printing both strings instead of first one". That person used:

HELLO_MSG db 'Hello, World!',0    ; normally you don't want ,0
GOODBYE_MSG db 'Goodbye!',0       ; in explicit-length strings, unless it also needs to be a C-string

hlen equ $ - HELLO_MSG
glen equ $ - GOODBYE_MSG

resulting in hlen including the length of both strings.
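
One way to fix that, as a sketch: measure each string on the line immediately after it, while $ still points at its end. The ,0 terminators are dropped here on the assumption that these are explicit-length strings only.

```nasm
HELLO_MSG   db 'Hello, World!'
hlen        equ $ - HELLO_MSG       ; evaluated right after HELLO_MSG: 13 bytes

GOODBYE_MSG db 'Goodbye!'
glen        equ $ - GOODBYE_MSG     ; evaluated right after GOODBYE_MSG: 8 bytes
```

If the length definitions have to come later, an end label after each string (for example a hypothetical HELLO_END) gives the same result without depending on where the equ line sits.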

EQU evaluates the right hand side right away, to a constant value. (In some assemblers like FASM, equ is a text substitution and you have to use glen = $ - GOODBYE_MSG to evaluate with $ at this position, instead of evaluating $ in a later mov ecx, glen instruction or something. But NASM's equ evaluates on the spot; use %define for text substitutions)
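
The difference between equ and %define can be sketched like this (assuming a flat binary starting at address 0; len_equ and len_def are illustrative names):

```nasm
        org 0                       ; assumption: flat binary, output starts at address 0
msg:    db "abc"                    ; 3 bytes at offsets 0..2
len_equ equ $ - msg                 ; evaluated on this line: 3
%define len_def ($ - msg)           ; plain text; $ is evaluated wherever len_def is used

        dd  len_equ                 ; at offset 3: stores 3
        dd  len_def                 ; at offset 7: expands to dd ($ - msg), so it stores 7
```

The second dd stores 7 rather than 3 because the macro text is pasted into that line, and $ there refers to the start of that line.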

Using $ is exactly equivalent to putting a label at the start of the line and using it instead of $.

The object-size example can also be done using regular labels:

msg:   db "Enter a digit "
msgend: 
Length equ msgend - msg
Length2 equ $ - msg     ; Length2 = Length

newline: db 0xA,0xD
Length3 equ $ - msg     ; Length3 includes the LF CR sequence as well.
                        ; sometimes that *is* what you want

You can put Length equ msgend - msg anywhere, or use mov ecx, msgend - msg directly. (It's sometimes useful to have a label on the end of something, e.g. cmp rsi, msgend / jb .loop at the bottom of a loop.)

BTW, it's usually CR LF, not LF CR.

times 4  dd $

assembles the same as this (but without creating a symbol table entry or clashing with an existing name):

here:    times 4 dd here

In times 4 dd $, $ doesn't update to its own address for each dword, it's still the address of the start of the line. (Try it in a file by itself and hexdump the flat binary: it's all zeros.)

But a %rep block is expanded by the preprocessor before $ is evaluated, so each repetition becomes its own line with its own $, and

%rep 4
    dd $
%endrep

does produce 0, 4, 8, 12 (starting from an output position of 0 in a flat binary for this example.)

$ nasm -o foo  rep.asm  && hd foo
00000000  00 00 00 00 04 00 00 00  08 00 00 00 0c 00 00 00  


Manually encoding jump displacements:

A normal direct call is E8 rel32, with the displacement calculated relative to the end of the instruction. (i.e. relative to EIP/RIP while the instruction is executing, because RIP holds the address of the next instruction. RIP-relative addressing modes work this way, too.) A dword is 4 bytes, so in a dd pseudo-instruction with one operand, the address of the end is $+4. You could of course just put a label on the next line and use that.

earlyfunc:           ; before the call
    call func        ; let NASM calculate the offset
    db  0xE8
    dd  func - ($ + 4)       ; or do it ourselves
    db  0xE8
    dd  earlyfunc - ($ + 4)  ; and it still works for negative offsets

    ...

func:                ; after the call

Disassembly output (from objdump -drwC -Mintel):

0000000000400080 <earlyfunc>:
  400080:       e8 34 00 00 00          call   4000b9 <func>    # encoded by NASM
  400085:       e8 2f 00 00 00          call   4000b9 <func>    # encoded manually
  40008a:       e8 f1 ff ff ff          call   400080 <earlyfunc>  # and backwards works too.

If you get the offset wrong, objdump will show the symbolic part as something like func+8. The relative displacements in the first two call instructions differ by 5 because call rel32 is 5 bytes long and both calls have the same actual destination, not the same relative displacement. Note that the disassembler takes care of adding each rel32 to the address of the end of its call instruction to show you absolute destination addresses.

You can use db target - ($+1) to encode the offset for a short jmp or jcc. (But beware: db 0xEB, target - ($+1) isn't right, because the end of the instruction is actually $+2 when you put both the opcode and displacement as multiple args for the same db pseudo-instruction.)
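
A sketch of the short-jump case, showing the two-line form and a one-line form adjusted to use $+2 (the labels start and target are illustrative, and a flat binary starting at address 0 is assumed):

```nasm
        bits 64
        org 0                       ; assumption: flat binary, output starts at address 0
start:
        jmp short target            ; offset 0: NASM encodes EB 05
        db  0xEB                    ; offset 2: opcode on its own line
        db  target - ($ + 1)        ; offset 3: the instruction ends at $+1, so this is 7 - 4 = 3
        db  0xEB, target - ($ + 2)  ; offset 4: $ is the line start, the end is $+2, so 7 - 6 = 1
        nop                         ; offset 6
target:
        ret                         ; offset 7
```

All three jumps land on target. Only the displacement byte changes, because each one is measured from the end of its own instruction.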

Related: $$ is the start of the current section, so $ - $$ is how far into the current section you are. But this is only within the current file, so linking two files that put stuff in .rodata is different from having two section .rodata blocks in the same source file. See What's the real meaning of $$ in nasm.
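
As a rough sketch of the same-file case (off1 and off2 are made-up names), $ - $$ keeps counting when a section is resumed later in the same source file:

```nasm
section .data
        db  1, 2, 3
off1    equ $ - $$          ; 3 bytes into .data so far

section .text
        nop

section .data               ; resume .data later in the same file
        db  4
off2    equ $ - $$          ; 4: counted from the start of this file's .data
```

After linking, another object file's .data may be placed before this one, which is why these offsets are only meaningful relative to this file's part of the section.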

By far the most common use is times 510-($-$$) db 0 / dw 0xAA55 to pad (with db 0) a boot sector out to 510 bytes, and then add the boot sector signature to make 512 bytes. (The NASM manual explains how this works)
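
A minimal boot-sector skeleton using that padding idiom might look like the sketch below (the code before the padding is a placeholder that just halts, and the file name boot.asm is an assumption):

```nasm
; Build as a flat binary:  nasm -f bin -o boot.bin boot.asm
        bits 16
        org 0x7C00              ; address where the BIOS loads the boot sector

start:
        cli                     ; placeholder code: disable interrupts and halt
.hang:  hlt
        jmp .hang

        times 510-($-$$) db 0   ; $-$$ = bytes emitted so far; pad with zeros up to offset 510
        dw  0xAA55              ; boot signature occupies bytes 510 and 511
```

If the code before the padding ever grows past 510 bytes, the times count goes negative and NASM reports an error instead of silently producing an oversized sector.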
