Printing floating point numbers from x86-64 seems to require %rbp to be saved

Question

When I write a simple assembly language program, linked with the C library, using gcc 4.6.1 on Ubuntu, and I try to print an integer, it works fine:

        .global main
        .text
main:
        mov     $format, %rdi
        mov     $5, %rsi
        mov     $0, %rax
        call    printf
        ret
format:
        .asciz  "%10d\n"

This prints 5, as expected.

But now if I make a small change, and try to print a floating point value:

        .global main
        .text
main:
        mov     $format, %rdi
        movsd   x, %xmm0
        mov     $1, %rax
        call    printf
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5

This program seg faults without printing anything. Just a sad segfault.

But I can fix this by pushing and popping %rbp.

        .global main
        .text
main:
        push    %rbp
        mov     $format, %rdi
        movsd   x, %xmm0
        mov     $1, %rax
        call    printf
        pop     %rbp
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5

Now it works, and prints 15.5000.

My question is: why did pushing and popping %rbp make the application work? According to the ABI, %rbp is one of the registers that the callee must preserve, and so printf cannot be messing it up. In fact, printf worked in the first program, when only an integer was passed to printf. So the problem must be elsewhere?

Recommended answer

I suspect the problem doesn't have anything to do with %rbp, but rather has to do with stack alignment. To quote the ABI:

The ABI requires that stack frames be aligned on 16-byte boundaries. Specifically, the end of the argument area (%rbp+16) must be a multiple of 16. This requirement means that the frame size should be padded out to a multiple of 16 bytes.

The stack is 16-byte aligned immediately before main() is called. The call that enters main() pushes an 8-byte return address, so on entry %rsp is off a 16-byte boundary by 8. Pushing another eight bytes (here the contents of %rbp, but any register would do) restores the alignment before the call to printf().
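To make the arithmetic concrete, here is the fixed program again, annotated with the value of %rsp modulo 16 at each step. This is only a sketch. It assumes the System V AMD64 convention that %rsp is a multiple of 16 immediately before every call instruction, so it is off by 8 on entry to any function, main included. The RIP-relative addressing and the explicit return value are choices made here so the example also links as a position-independent executable; they are not part of the original question.

```asm
        .global main
        .text
main:                                   # on entry: %rsp % 16 == 8 (the caller's call pushed 8 bytes)
        push    %rbp                    # %rsp % 16 == 0, aligned for the next call
        lea     format(%rip), %rdi      # first integer argument: address of the format string
        movsd   x(%rip), %xmm0          # first floating-point argument
        mov     $1, %eax                # %al = number of vector registers used by the variadic call
        call    printf                  # on entry to printf: %rsp % 16 == 8, as the callee expects
        pop     %rbp                    # %rsp % 16 == 8 again, same as on entry to main
        xor     %eax, %eax              # return 0 from main
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5
```

Removing the push/pop pair leaves %rsp % 16 == 8 at the call, which is the misaligned state the failing program was in.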

Here is the code that gcc generates (also on the Godbolt compiler explorer):

.LC1:
        .ascii "%10.4f\12\0"
main:
        leaq    .LC1(%rip), %rdi   # format string address
        subq    $8, %rsp           ### align the stack by 16 before a CALL
        movl    $1, %eax           ### 1 FP arg being passed in a register to a variadic function
        movsd   .LC0(%rip), %xmm0  # load the double itself
        call    printf
        xorl    %eax, %eax         # return 0 from main
        addq    $8, %rsp
        ret

As you can see, it deals with the alignment requirements by subtracting 8 from %rsp at the start, and adding it back at the end.

Rather than manipulating %rsp directly, you could do a dummy push/pop of whatever register you like; some compilers do use a dummy push to align the stack because this can actually be cheaper on modern CPUs, and it saves code size.
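As a sketch of the dummy push/pop approach, the version below pushes %rax, whose value is not needed, and later pops the slot into %rcx, a caller-saved scratch register. The register choices, the RIP-relative addressing and the explicit return value are illustrative assumptions, not taken from any particular compiler's output.

```asm
        .global main
        .text
main:
        push    %rax                    # dummy push: only the 8-byte adjustment of %rsp matters
        lea     format(%rip), %rdi      # address of the format string
        movsd   x(%rip), %xmm0          # the double to print
        mov     $1, %eax                # one vector register carries an argument
        call    printf
        pop     %rcx                    # discard the dummy slot into a scratch register
        xor     %eax, %eax              # return 0 from main
        ret
format:
        .asciz  "%10.4f\n"
x:
        .double 15.5
```

The pushed value is never read back, so which register is pushed does not matter; the pop target only needs to be a register that main is free to clobber.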
