Inline assembly that clobbers the red zone


Question

I'm writing a cryptography program whose core (a wide multiply routine) is written in x86-64 assembly, both for speed and because it makes extensive use of instructions such as adc that are not easily accessible from C. I don't want to inline this function, because it is big and it is called several times in the inner loop.

Ideally I would also like to define a custom calling convention for this function, because internally it uses all the registers (except rsp), doesn't clobber its arguments, and returns its results in registers. Right now it is adapted to the C calling convention, which of course makes it slower (by about 10%).

To avoid this, I can call it with asm("call %Pn" : ... : my_function... : "cc", all the registers); but is there a way to tell GCC that the call instruction messes with the stack? Otherwise GCC will just put all those registers in the red zone, and the top one will get clobbered by the return address that call pushes. I can compile the whole module with -mno-red-zone, but I'd prefer a way to tell GCC that, say, the top 8 bytes of the red zone will be clobbered, so that it won't put anything there.

Recommended answer

From your original question I did not realize that GCC limits red-zone use to leaf functions. I don't think that is required by the x86_64 ABI, but it is a reasonable simplifying assumption for a compiler. In that case you only need to make the function that calls your assembly routine a non-leaf for purposes of compilation:

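int global;

void other(void);   /* any function GCC cannot see through, e.g. defined in another file */

void was_leaf(void)
{
    /* GCC cannot prove that global is zero, so the call has to stay. */
    if (global) other();
}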

GCC can't tell whether global will be true, so it can't optimize away the call to other(), and was_leaf() is no longer a leaf function. I compiled this (with more code that triggered stack usage) and observed that as a leaf it did not move %rsp, while with the modification shown it did.
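To tie this back to the question, here is a minimal sketch of a wrapper that both issues the call from inline asm and contains the leaf-defeating branch. The routine triple_rax is a hypothetical stand-in for the real multiply routine (it just triples %rax), and the definition of other() is likewise made up for illustration. The code assumes GCC or Clang on an x86-64 ELF target and falls back to plain C elsewhere. It relies on the compiler behaviour observed above, not on anything the ABI guarantees.

```c
#include <stdint.h>

/* Stays 0 at run time, but it has external linkage, so the compiler
   cannot assume that and must keep the call to other(). */
int global;

/* Hypothetical helper that exists only to be called; noinline keeps
   it a real call instruction. */
__attribute__((noinline)) void other(void) { global = 0; }

#if defined(__x86_64__) && defined(__GNUC__) && defined(__ELF__)
/* Hypothetical stand-in for the hand-written routine: rax = 3 * rax. */
__asm__(
    ".pushsection .text\n"
    ".type triple_rax, @function\n"
    "triple_rax:\n"
    "    leaq (%rax,%rax,2), %rax\n"
    "    ret\n"
    ".popsection\n"
);
/* The call pushes a return address just below %rsp. */
#define CALL_TRIPLE(v) \
    __asm__ volatile("call triple_rax" : "+a"(v) : : "cc", "memory")
#else
/* Portable fallback so the sketch still builds on other targets. */
#define CALL_TRIPLE(v) ((v) *= 3)
#endif

uint64_t was_leaf(uint64_t x)
{
    if (global) other();   /* makes this function a non-leaf */
    CALL_TRIPLE(x);        /* safe only if no red zone is in use here */
    return x;
}
```

Calling was_leaf(14) returns 42. To check what the compiler did with the stack, compile with gcc -S and look for an adjustment of %rsp in was_leaf.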

I also tried simply allocating more than 128 bytes (just char buf[150]) in a leaf function, but I was shocked to see that it only did a partial subtraction:

    pushq   %rbp
    movq    %rsp, %rbp
    subq    $40, %rsp
    movb    $7, -155(%rbp)

Here the store to -155(%rbp) lands 115 bytes below %rsp, that is, inside the red zone. If I put the leaf-defeating code back in, that becomes subq $160, %rsp.
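A different route, which is not what the answer above does, is to leave the caller alone and step over the red zone inside the asm statement itself, so that the return address pushed by call lands below the 128 bytes GCC may be using. The sketch below assumes GCC or Clang on an x86-64 ELF target (with a plain C fallback elsewhere), and demo_routine is a hypothetical stand-in for the real routine that simply adds 1 to %rax.

```c
#include <stdint.h>

#if defined(__x86_64__) && defined(__GNUC__) && defined(__ELF__)
/* Hypothetical stand-in for the hand-written routine: rax = rax + 1. */
__asm__(
    ".pushsection .text\n"
    ".type demo_routine, @function\n"
    "demo_routine:\n"
    "    addq $1, %rax\n"
    "    ret\n"
    ".popsection\n"
);

static inline uint64_t call_demo_routine(uint64_t x)
{
    __asm__ volatile(
        "addq $-128, %%rsp\n\t"   /* step past the 128-byte red zone */
        "call demo_routine\n\t"   /* return address now lands below it */
        "subq $-128, %%rsp"       /* restore the stack pointer */
        : "+a"(x)                 /* value goes in and comes out in rax */
        :
        : "cc", "memory");        /* add/sub change the flags */
    return x;
}
#else
/* Portable fallback so the sketch still builds on other targets. */
static inline uint64_t call_demo_routine(uint64_t x) { return x + 1; }
#endif
```

For example, call_demo_routine(41) returns 42. The constant -128 is used with add and sub because it fits in a sign-extended 8-bit immediate, whereas +128 does not. The cost is two extra instructions per call, so whether this beats the non-leaf trick or -mno-red-zone would have to be measured.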
