Is this assembly function call safe/complete?


Question

I don't have experience in assembly, but this is what I've been working on. I would like input if I'm missing any fundamental aspects of passing parameters and calling a function via a pointer in assembly.

For instance, I'm wondering whether I'm supposed to restore ecx, edx, esi, and edi. I read that they are general-purpose registers, but I couldn't find out whether they need to be restored. Is there any kind of cleanup I am supposed to do after a call?

This is the code I have now, and it does work:

#include "stdio.h"

void foo(int a, int b, int c, int d)
{
  printf("values = %d and %d and %d and %d\n", a, b, c, d);
}

int main()
{

  int a=3,b=6,c=9,d=12;
  __asm__(
          "mov %3, %%ecx;"
          "mov %2, %%edx;"
          "mov %1, %%esi;"
          "mov %0, %%edi;"
          "call %4;"
          :
          : "g"(a), "g"(b), "g"(c), "g"(d), "a"(foo)
          );

}

Answer

The original question was "Is this assembly function call safe/complete?" The answer to that is: no. While it may appear to work in this simple example (especially if optimizations are disabled), you are violating rules, and that will eventually lead to failures that are really hard to track down.

I'd like to address the (obvious) followup question of how to make it safe, but without feedback from the OP on the actual intent, I can't really do that.

So, I'll do the best I can with what we have and try to describe the things that make it unsafe and some of the things you can do about it.

Let's start by simplifying that asm:

 __asm__(
          "mov %0, %%edi;"
          :
          : "g"(a)
          );

Even with this single statement, this code is already unsafe. Why? Because we are changing the value of a register (edi) without letting the compiler know.

How can the compiler not know, you ask? After all, it's right there in the asm! The answer comes from this line in the gcc docs:

GCC does not parse the assembler instructions themselves and does not know what they mean or even whether they are valid assembler input.

In that case, how do you let gcc know what's going on? The answer lies in using the constraints (the stuff after the colons) to describe the impact of the asm.

Perhaps the simplest way to fix this code would be like this:

  __asm__(
          "mov %0, %%edi;"
          :
          : "g"(a)
          : "edi"
          );

This adds edi to the clobber list. In brief, this tells gcc that the value of edi is going to be changed by the code, and that gcc shouldn't assume any particular value will be in it when the asm exits.
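As a minimal sketch of the clobber approach, here is a small function that can be compiled and called. It assumes GCC or Clang targeting x86; the name copy_through_edi is made up for illustration, and the #else branch exists only so the file still builds on other targets.

```c
/* Sketch only: assumes GCC or Clang targeting x86 (32- or 64-bit).
   copy_through_edi is a hypothetical name used for illustration. */
int copy_through_edi(int a)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__(
        "mov %1, %%edi\n\t"  /* overwrite edi with the input value */
        "mov %%edi, %0"      /* copy edi into the output register */
        : "=r"(out)          /* %0: any general-purpose register */
        : "g"(a)             /* %1: register, memory, or immediate */
        : "edi"              /* clobber: gcc must not keep a live value in edi */
    );
#else
    out = a;                 /* other targets: plain C so the file still builds */
#endif
    return out;
}
```

Calling copy_through_edi(3) returns 3. If you drop "edi" from the clobber list the function will often still appear to work, which is exactly why this kind of bug is so hard to notice.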

Now, while that's the easiest, it's not necessarily the best way. Consider this code:

  __asm__(
          ""
          :
          : "D"(a)
          );

This uses a machine constraint to tell gcc to put the value of the variable a into the edi register for you. Doing it this way, gcc will load the register for you at a 'convenient' time, perhaps by always keeping a in edi.
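To see the "D" constraint doing the load for you, a sketch along these lines may help. It assumes GCC or Clang on x86; read_from_edi is a hypothetical name. The asm only reads edi and never writes it, so it respects the rule for inputs.

```c
/* Sketch only: assumes GCC or Clang targeting x86 (32- or 64-bit).
   read_from_edi is a hypothetical name used for illustration. */
int read_from_edi(int a)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__(
        "mov %%edi, %0"  /* edi already holds a when this instruction runs */
        : "=r"(out)      /* %0: any general-purpose register */
        : "D"(a)         /* gcc loads a into edi before the asm */
    );
#else
    out = a;             /* other targets: plain C so the file still builds */
#endif
    return out;
}
```

Note that there is no mov into edi in the template at all: the constraint is what gets the value there, and gcc is free to pick the cheapest way of doing it.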

There is one (significant) caveat to this code: By putting the parameter after the 2nd colon, we are declaring it to be an input. Input parameters are required to be read-only (i.e. the register must hold the same value on exiting the asm as it did on entry).

In your case, the call statement means that we won't be able to guarantee that edi won't be changed, so this doesn't quite work. There are a few ways to deal with this. The easiest is to move the constraint up after the first colon, making it an output, and specify "+D" to indicate that the value is read+write. But then the contents of a are going to be pretty much undefined after the asm (printf could set it to anything). If destroying a is unacceptable, there's always something like this:

  int junk;
  __asm__ volatile (
          ""
          : "=D" (junk)
          : "0"(a)
          );

This tells gcc that on starting the asm, it should put the value of the variable a into the same place as output constraint #0 (i.e. edi). It also says that on exit, edi no longer holds a; whatever edi contains at that point is stored into the variable junk.

Since the 'junk' variable isn't actually going to be used, we need to add the volatile qualifier; otherwise gcc is free to discard an asm whose only output is never read. Volatile was implicit when there weren't any output parameters.
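Here is a sketch of the junk-output pattern in a form you can call. It assumes GCC or Clang on x86; trash_edi_keep_a is a hypothetical name, and the xor is only a stand-in for code (such as a call) that leaves edi with an unpredictable value.

```c
/* Sketch only: assumes GCC or Clang targeting x86 (32- or 64-bit).
   trash_edi_keep_a is a hypothetical name used for illustration. */
int trash_edi_keep_a(int a)
{
#if defined(__i386__) || defined(__x86_64__)
    int junk;
    __asm__ volatile(
        "xor %%edi, %%edi"  /* stand-in for code that destroys edi */
        : "=D"(junk)        /* output #0: edi ends up in junk */
        : "0"(a)            /* input: a starts in the same place as #0 */
        : "cc"              /* xor changes the flags */
    );
    (void)junk;             /* deliberately unused */
#endif
    return a;               /* a itself was never an output, so it is intact */
}
```

Because a is only an input, gcc keeps (or reloads) its own copy of a, so trash_edi_keep_a(7) still returns 7 even though edi was zeroed.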

One other point on that line: you end it with a semicolon. This is legal and will work as expected. However, if you ever want to use the -S command line option to see exactly what code got generated (and if you want to get good with inline asm, you will), you will find that it produces difficult-to-read code. I'd recommend using \n\t instead of a semicolon.
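For what it's worth, this is roughly what a multi-instruction template looks like with \n\t separators (the form the gcc docs themselves use). It is only a sketch, assuming GCC or Clang on x86; add_via_asm is a hypothetical name.

```c
/* Sketch only: assumes GCC or Clang targeting x86 (32- or 64-bit).
   add_via_asm is a hypothetical name used for illustration. */
int add_via_asm(int a, int b)
{
    int out;
#if defined(__i386__) || defined(__x86_64__)
    __asm__(
        "mov %1, %0\n\t"  /* each instruction lands on its own line in -S output */
        "add %2, %0"      /* no separator needed after the last instruction */
        : "=&r"(out)      /* & (early clobber): out is written before b is read */
        : "g"(a), "g"(b)
        : "cc"            /* add changes the flags */
    );
#else
    out = a + b;          /* other targets: plain C so the file still builds */
#endif
    return out;
}
```

If you compile this with gcc -S, the two instructions show up on separate, tab-indented lines, just like the code gcc generates itself.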

All that and we're still on the first line...

Obviously the same would apply to the other three mov statements.

Which brings us to the call statement.

Both Michael and I have listed a number of reasons why doing a call from inline asm is difficult:

  • Handling all the registers that may be clobbered by the function call's ABI.
  • Handling red-zone.
  • Handling alignment.
  • Memory clobber.
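To give a feel for how much bookkeeping those four points imply, here is a rough sketch of what the call might look like. It assumes GCC or Clang and the System V x86-64 ABI (Linux, macOS); sum4 and call_sum4_via_asm are hypothetical names (sum4 mirrors foo from the question but returns a value so the result can be checked), and the clobber list is an assumption that is probably not exhaustive (x87/MMX state and AVX-512 registers are left out). Treat it as an experiment, not something to copy into production.

```c
#include <stdio.h>

/* Hypothetical callee, modelled on foo() from the question. */
int sum4(int a, int b, int c, int d)
{
    printf("values = %d and %d and %d and %d\n", a, b, c, d);
    return a + b + c + d;
}

/* Sketch only: assumes the System V x86-64 ABI. Elsewhere it falls back
   to an ordinary C call so the file still builds. */
int call_sum4_via_asm(int a, int b, int c, int d)
{
#if defined(__GNUC__) && defined(__x86_64__) && !defined(_WIN32)
    int ret;
    __asm__ volatile(
        "mov %%rsp, %%rbx\n\t"  /* remember the original stack pointer */
        "sub $128, %%rsp\n\t"   /* step over the 128-byte red zone */
        "and $-16, %%rsp\n\t"   /* the ABI wants rsp 16-byte aligned at a call */
        "call *%[fn]\n\t"       /* fn is in a register, so no rsp-relative operand */
        "mov %%rbx, %%rsp"      /* rbx is callee-saved, so it survived the call */
        : "=a"(ret),            /* return value comes back in eax */
          "+D"(a), "+S"(b), "+d"(c), "+c"(d)  /* args; the callee may trash them */
        : [fn] "r"(sum4)
        : "rbx", "r8", "r9", "r10", "r11",    /* rbx is ours; the rest are caller-saved */
          "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
          "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15",
          "memory", "cc"        /* the callee may read or write any memory */
    );
    return ret;
#else
    return sum4(a, b, c, d);
#endif
}
```

Calling call_sum4_via_asm(3, 6, 9, 12) prints the four values and returns 30. Compare that with simply writing sum4(a, b, c, d) in C and you can see why I'd be reluctant to do this outside of a learning exercise.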

If the goal here is 'learning,' then feel free to experiment. But I don't know that I would ever feel comfortable doing this in production code. Even when it looks like it works, I'd never feel confident there wasn't some weird case I'd missed. That's aside from my normal concerns about using inline asm at all.

I know, that's a lot of information. Probably more than you were looking for as an introduction to gcc's asm command, but you've picked a challenging place to start.

If you haven't done so already, spend time looking over all the docs in gcc's Assembly Language interface. There's a lot of good information there along with examples to try to explain how it all works.
