Printing floating point numbers from x86-64 seems to require %rbp to be saved

Question

When I write a simple assembly language program, linked with the C library, using gcc 4.6.1 on Ubuntu, and I try to print an integer, it works fine:

        .global main
        .text
main:
        mov     $format, %rdi
        mov     $5, %rsi
        mov     $0, %rax
        call    printf
        ret
format:
        .asciz  "%10d\n"

This prints 5, as expected.

But now if I make a small change, and try to print a floating point value:

        .global main
        .text
main:
        mov     $format, %rdi
        movsd   x, %xmm0
        mov     $1, %rax
        call    printf
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5

This program seg faults without printing anything. Just a sad segfault.

But I can fix this by pushing and popping %rbp.

        .global main
        .text
main:
        push    %rbp
        mov     $format, %rdi
        movsd   x, %xmm0
        mov     $1, %rax
        call    printf
        pop     %rbp
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5

Now it works, and prints 15.5000.

My question is: why did pushing and popping %rbp make the application work? According to the ABI, %rbp is one of the registers that the callee must preserve, and so printf cannot be messing it up. In fact, printf worked in the first program, when only an integer was passed to printf. So the problem must be elsewhere?

Recommended answer

I suspect the problem doesn't have anything to do with %rbp, but rather has to do with stack alignment. To quote the ABI:

The ABI requires that stack frames be aligned on 16-byte boundaries. Specifically, the end of the argument area (%rbp+16) must be a multiple of 16. This requirement means that the frame size should be padded out to a multiple of 16 bytes.

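The ABI requires %rsp to be 16-byte aligned at the point of each call instruction. That holds just before main() is called, but the call itself pushes an 8-byte return address, so on entry to main() the stack pointer is off by 8. Calling printf() in that state violates the alignment requirement. You restore the alignment by pushing another eight bytes onto the stack (which happen to be %rbp but could just as easily be something else).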
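The arithmetic behind this can be sketched in C. This is only a model of the stack pointer, not code from the question: the function names and the example address used to exercise it are made up for illustration, and it assumes the caller of main() followed the ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes by which a stack pointer value is off a 16-byte boundary. */
unsigned misalignment(uint64_t rsp)
{
    return (unsigned)(rsp & 15u);
}

/*
 * Value of %rsp at the `call printf` instruction inside main.
 *
 * rsp_before_main_call: %rsp just before main was called (assumed to be
 *                       16-byte aligned, as the ABI requires of the caller).
 * bytes_reserved:       bytes main pushed or subtracted before calling
 *                       printf (0 in the crashing version, 8 in the fixed one).
 */
uint64_t rsp_at_call(uint64_t rsp_before_main_call, unsigned bytes_reserved)
{
    assert(misalignment(rsp_before_main_call) == 0);
    /* The call to main pushed an 8-byte return address. */
    uint64_t rsp_on_entry = rsp_before_main_call - 8;
    return rsp_on_entry - bytes_reserved;
}
```

With a hypothetical starting value such as 0x7ffc00001000, misalignment(rsp_at_call(start, 0)) is 8, which corresponds to the crashing program, and misalignment(rsp_at_call(start, 8)) is 0, which corresponds to the version with push %rbp. The integer version most likely got away with the misalignment because, with %al set to 0, printf skipped the code that saves the XMM registers using alignment-sensitive instructions; that is luck rather than something to rely on.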

Here is the code that gcc generates (also on the Godbolt compiler explorer):

.LC1:
        .ascii "%10.4f\12\0"
main:
        leaq    .LC1(%rip), %rdi   # format string address
        subq    $8, %rsp           ### align the stack by 16 before a CALL
        movl    $1, %eax           ### 1 FP arg being passed in a register to a variadic function
        movsd   .LC0(%rip), %xmm0  # load the double itself
        call    printf
        xorl    %eax, %eax         # return 0 from main
        addq    $8, %rsp
        ret

As you can see, it deals with the alignment requirements by subtracting 8 from %rsp at the start, and adding it back at the end.
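The C source behind that listing is not shown. The sketch below is an assumption about what an equivalent variadic call looks like, using snprintf instead of printf so the result can be inspected; the helper name format_double is hypothetical. On x86-64 System V, a call like this also passes the double in %xmm0 and sets %al to 1, and the compiler takes care of aligning %rsp before the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format one double with the same "%10.4f\n" format string as the
 * assembly examples. Returns the number of characters snprintf would
 * write (excluding the terminating NUL).
 */
int format_double(char *buf, size_t size, double x)
{
    assert(buf != NULL && size > 0);
    /* Variadic call with one floating-point argument. */
    return snprintf(buf, size, "%10.4f\n", x);
}
```

Calling format_double(buf, sizeof buf, 15.5) with a 32-byte buffer fills it with the value right-aligned in a field of width 10, followed by a newline, matching what the working assembly version prints.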

Instead of manipulating %rsp directly, you could do a dummy push/pop of whatever register you like; some compilers do use a dummy push to align the stack because this can actually be cheaper on modern CPUs, and saves code size.
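The code-size point can be made concrete by comparing instruction encodings. The sketch below stores the x86-64 machine-code bytes for both approaches in C arrays and adds up their sizes; the array and function names are made up for illustration, and the choice of %rax and %rcx for the dummy push and pop is just one possibility.

```c
#include <assert.h>
#include <stddef.h>

/* Adjusting %rsp explicitly: 4 bytes each (REX.W prefix, opcode, ModRM, imm8). */
const unsigned char SUB_8_RSP[] = {0x48, 0x83, 0xEC, 0x08}; /* sub $8, %rsp */
const unsigned char ADD_8_RSP[] = {0x48, 0x83, 0xC4, 0x08}; /* add $8, %rsp */

/* Dummy push/pop: 1 byte each. The pop targets a scratch register so the
   return value in %rax is not clobbered. */
const unsigned char PUSH_RAX[] = {0x50}; /* push %rax */
const unsigned char POP_RCX[]  = {0x59}; /* pop  %rcx */

static_assert(sizeof SUB_8_RSP == 4, "sub $8, %rsp encodes in 4 bytes");
static_assert(sizeof PUSH_RAX == 1, "push %rax encodes in 1 byte");

/* Total code bytes spent on alignment with sub/add. */
size_t sub_add_bytes(void)
{
    return sizeof SUB_8_RSP + sizeof ADD_8_RSP;
}

/* Total code bytes spent on alignment with a dummy push/pop. */
size_t push_pop_bytes(void)
{
    return sizeof PUSH_RAX + sizeof POP_RCX;
}
```

Under these encodings the sub/add pair costs 8 bytes and the push/pop pair costs 2. Whether the push is also faster depends on the CPU and the surrounding code, so this only illustrates the size argument.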
