Inline assembly that clobbers the red zone

Question

I'm writing a cryptography program, and the core (a wide multiply routine) is written in x86-64 assembly, both for speed and because it extensively uses instructions like adc that are not easily accessible from C. I don't want to inline this function, because it's big and it's called several times in the inner loop.

Ideally I would also like to define a custom calling convention for this function, because internally it uses all the registers (except rsp), doesn't clobber its arguments, and returns in registers. Right now, it's adapted to the C calling convention, but of course this makes it slower (by about 10%).

To avoid this, I can call it with asm("call %Pn" : ... : my_function... : "cc", all the registers); but is there a way to tell GCC that the call instruction messes with the stack? Otherwise GCC will just put all those registers in the red zone, and the top one will get clobbered. I can compile the whole module with -mno-red-zone, but I'd prefer a way to tell GCC that, say, the top 8 bytes of the red zone will be clobbered so that it won't put anything there.

Recommended answer

From your original question I did not realize gcc limited red-zone use to leaf functions. I don't think that's required by the x86_64 ABI, but it is a reasonable simplifying assumption for a compiler. In that case you only need to make the function calling your assembly routine a non-leaf for purposes of compilation:

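int global;          /* set at run time; GCC cannot assume it is zero */

void other(void);    /* any external function */

void was_leaf(void)
{
    if (global) other();   /* the possible call makes this a non-leaf */
}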

GCC can't tell whether global will be true, so it can't optimize away the call to other(), and was_leaf() is no longer a leaf function. I compiled this (with more code that triggered stack usage) and observed that as a leaf it did not move %rsp, and with the modification shown it did.
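The trick above can be sketched as a complete translation unit. This is only an illustration under assumptions: other_calls and scratch are hypothetical names added for demonstration, the noinline attribute is there so the call is not folded into was_leaf(), and whether a given GCC version then stops using the red zone is best confirmed by inspecting the generated assembly (for example with gcc -O2 -S).

```c
int global;              /* run-time flag; the compiler cannot assume it is zero */
static int other_calls;  /* hypothetical counter, for demonstration only */

/* noinline keeps this a real call instead of letting it be inlined away. */
__attribute__((noinline)) void other(void)
{
    other_calls++;
}

/* Stand-in for the function that would invoke the assembly routine. */
void was_leaf(void)
{
    volatile char scratch[64];  /* hypothetical local that forces stack usage */
    scratch[0] = 1;
    if (global)
        other();                /* the possible call makes this a non-leaf */
}
```

When global stays zero at run time, other() is never executed, so the extra branch costs very little; its only purpose is to change how the compiler lays out the stack frame.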

I also tried simply allocating more than 128 bytes (just char buf[150]) in a leaf, but I was shocked to see it only did a partial subtraction:

    pushq   %rbp
    movq    %rsp, %rbp
    subq    $40, %rsp
    movb    $7, -155(%rbp)

If I put the leaf-defeating code back in, that becomes subq $160, %rsp.
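The source for that experiment was not shown, so the following is a reconstruction under assumptions: the name big_leaf, the index 5 and the volatile qualifier are all hypothetical. The x86-64 System V ABI reserves a 128-byte red zone below %rsp, so in a leaf function the compiler may subtract only the part of the frame that does not fit in it and let the rest of the buffer live in the red zone. That would explain the partial subtraction.

```c
/* Hypothetical leaf function: it makes no calls and has more than
   128 bytes of locals. */
int big_leaf(void)
{
    volatile char buf[150];  /* volatile so the store is not optimized away */
    buf[5] = 7;              /* a store like movb $7, ...(%rbp) in the listing */
    return buf[5];
}
```

Compiling something like this with and without a leaf-defeating call, and comparing the subq instruction in the two listings, should show whether the red zone is in use for a particular compiler and set of flags.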
