Segmentation fault when using DB (define byte) inside a function

Posted: 2021/12/18 9:11:51 · Tags: assembly, x86-64, nasm, machine-code

Question

I'm trying to define a byte inside the .text section of an assembly program. I know data should go in the .data section, but I'd like to understand why doing this gives me a segmentation fault. If I define the byte in .data instead, I get no errors. I'm on a Linux machine running Mint 19.1, using NASM + LD to assemble and link the executable.

This runs without a segmentation fault:

global _start
section .data
db 0x41
section .text
_start:
    mov rax, 60    ; Exit(0) syscall
    xor rdi, rdi
    syscall

This gives me a segfault:

global _start
section .text
_start:
    db 0x41
    mov rax, 60     ; Exit(0) syscall
    xor rdi, rdi
    syscall

I'm using the following script to assemble and link it:

nasm -felf64 main.s -o main.o
ld main.o -o main

I expected the program to run without a segmentation fault, but it doesn't when I use DB inside .text. I suspect that .text is read-only and that this may be the cause. Am I correct? Can someone explain why my second code example segfaults?

Recommended answer

If you tell the assembler to assemble arbitrary bytes somewhere, it will. db is a pseudo-instruction that emits bytes, so mov eax, 60 and db 0xb8, 0x3c, 0, 0, 0 are exactly equivalent as far as NASM is concerned. Either one emits those 5 bytes into the output at the current position.
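
To make that equivalence concrete, here is a minimal sketch of an exit(0) program in which the mov is spelled out entirely with db. The layout is my own illustration rather than code from the question, and it assumes the same nasm -felf64 + ld build shown there.

```nasm
; Sketch: writing "mov eax, 60" by hand with db.
; Assemble and link the same way as the question's examples.
global _start
section .text
_start:
    db 0xb8, 0x3c, 0, 0, 0   ; opcode B8 + little-endian imm32: mov eax, 60
    xor edi, edi             ; exit status 0
    syscall                  ; exit(0), because EAX really is 60 here
```

Disassembling the result with objdump -d should show an ordinary mov eax,0x3c, since the CPU and the disassembler only ever see bytes.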

If you don't want your data to be decoded as (part of) an instruction, don't put it where execution will reach it.
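
As a hedged sketch of that advice, the byte can stay in .text as long as it sits after the last instruction that actually executes. The label my_byte is a hypothetical name added for illustration.

```nasm
; Sketch: a data byte in .text that execution never reaches.
global _start
section .text
_start:
    mov eax, 60          ; __NR_exit
    xor edi, edi         ; exit status 0
    syscall              ; exit does not return, so nothing below is executed
my_byte:                 ; hypothetical label, not from the question
    db 0x41              ; plain data, placed after the last executed instruction
```

Code can still read my_byte from here. The problem in the question is the byte being fetched as part of the instruction stream, not the section it lives in. If the byte really had to sit between instructions, a jmp over it would be another option.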

Since you're using NASM¹, it optimizes mov rax,60 into mov eax,60, so the instruction doesn't have the REX prefix you'd expect from the source.

Your manually-encoded REX prefix for mov changes it into a mov to R8D instead of EAX:

    41 b8 3c 00 00 00       mov r8d,0x3c

(I checked with objdump -drwC -Mintel instead of looking up which bit is which in the REX prefix. I only remembered that REX.W is 0x48. But 0x41 is a REX.B prefix in x86-64.)
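
For reference, a REX prefix has the bit layout 0100WRXB, so 0x41 sets only B (which extends the register number encoded in the opcode) and 0x48 sets only W. The sketch below, with hypothetical labels of my own, puts the question's byte sequence next to the instruction it turns into so the two can be compared in a disassembly:

```nasm
; Sketch: both sequences assemble to the same 6 bytes, 41 b8 3c 00 00 00.
; Assemble with nasm -felf64 and inspect with objdump -drwC -Mintel.
section .text
with_prefix_byte:        ; hypothetical label
    db 0x41              ; 0100 0001b: REX with only the B bit set
    mov eax, 60          ; b8 3c 00 00 00 (register number 0 = EAX)
written_directly:        ; hypothetical label
    mov r8d, 60          ; 41 b8 3c 00 00 00 (REX.B makes register 0 into 8 = R8D)
```

This fragment is meant for disassembly only; it has no _start and is not a complete program.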

So instead of making a sys_exit system call, your code runs syscall with EAX=0, which is __NR_read. (The Linux kernel zeros all registers other than RSP before process startup, and in a statically-linked executable, _start is the true entry point, with no dynamic-linker code running first. So RAX is still zero.)

$ strace ./rex
execve("./rex", ["./rex"], 0x7fffbbadad60 /* 54 vars */) = 0
read(0, NULL, 0)                        = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=NULL} ---
+++ killed by SIGSEGV (core dumped) +++

Execution then falls through into whatever comes after syscall, which in this case is 00 00 bytes that decode as add [rax], al, and thus segfault. Since read returned 0, RAX is 0, so that add accesses address 0, which matches si_addr=NULL in the strace output. You would have seen this if you'd run your code inside GDB.

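Footnote 1: if you had used YASM, which doesn't optimize mov rax,60 to 32-bit operand size, the 0x41 byte would land in front of an instruction that already has its own REX.W prefix.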

Intel's manuals say that it's illegal to have 2 REX prefixes on one instruction. I expected an illegal-instruction fault (#UD machine exception -> kernel delivers SIGILL), but my Skylake CPU ignores the first REX prefix and decodes it as mov rax, sign_extended_imm32.

When single-stepping, it's treated as one long instruction, so I guess Skylake chooses to handle it like other cases of multiple prefixes, where only the last one of a type has an effect. (But remember this is not future-proof; other x86 CPUs could handle it differently.)
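
The byte sequence this footnote discusses can be reproduced in NASM by writing the un-optimized encoding out with db. This is a sketch under the assumption that the 64-bit form is the standard REX.W + C7 /0 encoding with a sign-extended 32-bit immediate; what happens when it runs depends on the CPU, as described above.

```nasm
; Sketch: a stray REX.B byte in front of an instruction that already has REX.W.
; Behavior at run time is CPU-dependent (see the footnote); do not rely on it.
global _start
section .text
_start:
    db 0x41                             ; the question's data byte, seen as REX.B
    db 0x48, 0xc7, 0xc0, 0x3c, 0, 0, 0  ; REX.W C7 /0 id: mov rax, 60 (imm32 sign-extended)
    xor edi, edi                        ; exit status 0
    syscall                             ; exit(0) only if the bytes above decoded as mov rax, 60
```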

Related / same bug in other situations:

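Assembly (x86): <label> db 'string',0 does not get executed unless there's a jump instruction (in a BIOS MBR boot sector)
Unknown opcode skipped: 66, not 8086 instruction - not supported yet (in EMU8086)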
