glibc scanf Segmentation faults when called from a function that doesn't align RSP

Question

When compiling the following code:

global main
extern printf, scanf

section .data
   msg: db "Enter a number: ",10,0
   format:db "%d",0

section .bss
   number resb 4

section .text
main:
   mov rdi, msg
   mov al, 0
   call printf

   mov rsi, number
   mov rdi, format
   mov al, 0
   call scanf

   mov rdi,format
   mov rsi,[number]
   inc rsi
   mov rax,0
   call printf 

   ret

with:

nasm -f elf64 example.asm -o example.o
gcc -no-pie -m64 example.o -o example

and then running

./example

it runs and prints "Enter a number: ", but then crashes and prints: Segmentation fault (core dumped)

So printf works fine but scanf does not. What am I doing wrong with scanf?

Recommended answer

Use sub rsp, 8 / add rsp, 8 at the start/end of your function to re-align the stack to 16 bytes before your function does a call.

Or better, push/pop a dummy register, e.g. push rdx / pop rcx, or save/restore a call-preserved register like RBP. The total change to RSP, counting all pushes and any sub rsp, must be an odd multiple of 8.
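As a minimal sketch of that fix applied to the question's program (an illustration, not a listing from the original answer), the version below saves and restores RBP around the calls. Two further changes are additions beyond the alignment fix and are labeled in the comments: the result is loaded with a 32-bit mov because number reserves only 4 bytes, and EAX is zeroed before ret so that main returns 0.

```nasm
global main
extern printf, scanf

section .data
   msg:    db "Enter a number: ",10,0
   format: db "%d",0

section .bss
   number: resb 4

section .text
main:
   push rbp               ; one 8-byte push: RSP is now a multiple of 16

   mov  rdi, msg
   xor  eax, eax          ; AL = 0: no arguments passed in XMM registers
   call printf

   mov  rsi, number
   mov  rdi, format
   xor  eax, eax
   call scanf             ; called with a 16-byte aligned stack

   mov  rdi, format
   mov  esi, [number]     ; addition: load only the 4 bytes that %d stored
   inc  esi
   xor  eax, eax
   call printf

   pop  rbp               ; undo the push so ret finds the return address
   xor  eax, eax          ; addition: return 0 from main
   ret
```

It can be built with the same nasm and gcc -no-pie commands as in the question. The push both preserves a call-preserved register and moves RSP by 8, which is the "odd multiple of 8" the answer asks for.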

On function entry, RSP is 8 bytes away from 16-byte alignment because the call pushed an 8-byte return address. See "Printing floating point numbers from x86-64 seems to require %rbp to be saved", "main and stack alignment", and "Calling printf in x86_64 using GNU assembler". This is an ABI requirement which you used to be able to get away with violating when there weren't any FP args for printf. But not any more.

gcc's code-gen for glibc scanf now depends on 16-byte stack alignment even when AL == 0

It seems to have auto-vectorized copying 16 bytes somewhere in __GI__IO_vfscanf, which regular scanf calls after spilling its register args to the stack (see Footnote 1). (The many similar ways to call scanf share one big implementation as a back end to the various libc entry points like scanf, fscanf, etc.)

I downloaded Ubuntu 18.04's libc6 binary package: https://packages.ubuntu.com/bionic/amd64/libc6/download and extracted the files (with 7z x blah.deb and tar xf data.tar, because 7z knows how to extract a lot of file formats).

I can reproduce your bug with LD_LIBRARY_PATH=/tmp/bionic-libc/lib/x86_64-linux-gnu ./bad-printf and, it turns out, also with the system glibc 2.27-3 on my Arch Linux desktop.

With GDB, I loaded your program, did set env LD_LIBRARY_PATH /tmp/bionic-libc/lib/x86_64-linux-gnu, then run. With layout reg, the disassembly window looks like this at the point where it received SIGSEGV:

   0x7ffff786b49a <_IO_vfscanf+602>        cmp    r12b,0x25
   0x7ffff786b49e <_IO_vfscanf+606>        jne    0x7ffff786b3ff <_IO_vfscanf+447>
   0x7ffff786b4a4 <_IO_vfscanf+612>        mov    rax,QWORD PTR [rbp-0x460]
   0x7ffff786b4ab <_IO_vfscanf+619>        add    rax,QWORD PTR [rbp-0x458]
   0x7ffff786b4b2 <_IO_vfscanf+626>        movq   xmm0,QWORD PTR [rbp-0x460]
   0x7ffff786b4ba <_IO_vfscanf+634>        mov    DWORD PTR [rbp-0x678],0x0
   0x7ffff786b4c4 <_IO_vfscanf+644>        mov    QWORD PTR [rbp-0x608],rax
   0x7ffff786b4cb <_IO_vfscanf+651>        movzx  eax,BYTE PTR [rbx+0x1]
   0x7ffff786b4cf <_IO_vfscanf+655>        movhps xmm0,QWORD PTR [rbp-0x608]
  >0x7ffff786b4d6 <_IO_vfscanf+662>        movaps XMMWORD PTR [rbp-0x470],xmm0

So it copied two 8-byte objects to the stack with movq + movhps to load and movaps to store. But with the stack misaligned, movaps [rbp-0x470],xmm0 faults.
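The difference between an aligned and an unaligned 16-byte store can be isolated in a small standalone program. This is a sketch for illustration only, not code from glibc. It assumes main is entered with RSP 8 bytes away from 16-byte alignment, as the ABI guarantees when the caller follows it.

```nasm
global main

section .text
main:
   sub    rsp, 24           ; entry RSP is 8 mod 16, so RSP is now 0 mod 16
   pxor   xmm0, xmm0

   movups [rsp+8], xmm0     ; unaligned store: allowed at any address
   movaps [rsp], xmm0       ; aligned store to a 16-byte aligned address: fine
   movaps [rsp+8], xmm0     ; address is 8 mod 16: raises #GP, seen as SIGSEGV

   add    rsp, 24           ; not reached
   xor    eax, eax
   ret
```

Built the same way as the question's program, this should crash at the second movaps. That is the same failure mode as the movaps inside _IO_vfscanf: the instruction requires its memory operand to be 16-byte aligned, and the compiler emitted it on the assumption that the caller kept the stack aligned.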

I didn't grab a debug build to find out exactly which part of the C source turned into this, but the function is written in C and compiled by GCC with optimization enabled. GCC has always been allowed to do this, but only recently did it get smart enough to take better advantage of SSE2 this way.

Footnote 1: printf / scanf with AL != 0 has always required 16-byte alignment because gcc's code-gen for variadic functions uses test al,al / je to spill the full 16-byte XMM regs xmm0..7 with aligned stores in that case. __m128i can be an argument to a variadic function, not just double, and gcc doesn't check whether the function ever actually reads any 16-byte FP args.
