How are C structures passed to functions in assembly?

Question

1) How do C structures get passed to a function at the assembly level? I mean pass by value, not pass by reference. 2) By the way, how does a callee return a structure to its caller? I'm sorry for any poor phrasing, since I'm not a native English speaker.

I wrote a simple program to check how C structures get passed to a function, but the result was quite surprising. Some values seemed to be passed in registers, while others were passed by pushing them onto the stack. Here is the code.

Source code

#include <stdio.h>

typedef struct {
        int age;
        enum {Man, Woman} gen;
        double height;
        int class;
        char *name;
} student;

void print_student_info(student s) {
        printf("age: %d, gen: %s, height: %f, name: %s\n", 
                        s.age,
                        s.gen == Man? "Man":"Woman",
                        s.height, s.name);
}

int main() {
        student s;
        s.age = 10;
        s.gen = Man;
        s.height = 1.30;
        s.class = 3;
        s.name = "Tom";
        print_student_info(s);
        return 0;
}

asm

 6fa:   55                      push   %rbp
 6fb:   48 89 e5                mov    %rsp,%rbp
 6fe:   48 83 ec 20             sub    $0x20,%rsp
 702:   c7 45 e0 0a 00 00 00    movl   $0xa,-0x20(%rbp)
 709:   c7 45 e4 00 00 00 00    movl   $0x0,-0x1c(%rbp)
 710:   f2 0f 10 05 00 01 00    movsd  0x100(%rip),%xmm0        # 818 <_IO_stdin_used+0x48>
 717:   00 
 718:   f2 0f 11 45 e8          movsd  %xmm0,-0x18(%rbp)
 71d:   c7 45 f0 03 00 00 00    movl   $0x3,-0x10(%rbp)
 724:   48 8d 05 e5 00 00 00    lea    0xe5(%rip),%rax        # 810 <_IO_stdin_used+0x40>
 72b:   48 89 45 f8             mov    %rax,-0x8(%rbp)
 72f:   ff 75 f8                pushq  -0x8(%rbp)
 732:   ff 75 f0                pushq  -0x10(%rbp)
 735:   ff 75 e8                pushq  -0x18(%rbp)
 738:   ff 75 e0                pushq  -0x20(%rbp)
 73b:   e8 70 ff ff ff          callq  6b0 <print_student_info>
 740:   48 83 c4 20             add    $0x20,%rsp
 744:   b8 00 00 00 00          mov    $0x0,%eax
 749:   c9                      leaveq 
 74a:   c3                      retq   
 74b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)   

I expected the structure to be passed to the function on the stack, but the code above suggests it wasn't.

Recommended answer

As others have pointed out, passing structures by value is generally frowned upon, but the C language allows it nonetheless. I'll discuss the code you did use, even though it isn't how I would have written it.

How structures are passed depends on the ABI / calling convention. There are two primary 64-bit ABIs in use today (there may be others): the 64-bit Microsoft ABI and the x86-64 System V ABI. The 64-bit Microsoft ABI is simple: a structure of exactly 1, 2, 4, or 8 bytes is passed as if it were an integer of that size, and a structure of any other size is copied into memory by the caller (normally on its stack) and passed as a pointer to that copy. The x86-64 System V ABI (used by Linux/macOS/BSD) is more complex: a recursive algorithm determines whether a structure can be passed in a combination of general-purpose registers and vector registers (the x87 FPU stack registers come into play only for return values). If the algorithm determines that the structure can be passed in registers, the object isn't placed on the stack for the call. If it doesn't fit in registers per the rules, it is passed in memory on the stack.

There is a telltale sign that your code isn't using the 64-bit Microsoft ABI: the compiler didn't reserve 32 bytes of shadow space before making the function call. So this is almost certainly a compiler targeting the x86-64 System V ABI. I can reproduce the assembly code in your question using GCC with optimizations disabled on the online Godbolt compiler explorer.

Walking through the algorithm for passing aggregate types (such as structures and unions) is beyond the scope of this answer; you can refer to section 3.2.3 Parameter Passing of the ABI. I can say, though, that this structure is passed on the stack because of a post-merger cleanup rule that says:

If the size of the aggregate exceeds two eightbytes and the first eightbyte isn't SSE or any other eightbyte isn't SSEUP, the whole argument is passed in memory.

As it happens, classification splits your structure into four eightbytes: the first two 32-bit values (age and gen) would be packed into one 64-bit general-purpose register, the double would go in a vector register, the int class (padded to 8 bytes because of alignment rules) would go in another 64-bit register, and the pointer in yet another 64-bit register. Your structure therefore exceeds two eightbytes, and its first eightbyte isn't of class SSE, so the compiler passes the whole structure in memory on the stack.
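
To make the quoted rule concrete, here is a minimal sketch in C that applies only that one post-merger cleanup rule to an array of eightbyte classes. It is not the full classification algorithm from the ABI; the names eightbyte_class and passed_in_memory are hypothetical helpers invented for illustration, and the set of classes is deliberately reduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced set of eightbyte classes (assumption: the real ABI defines more,
 * such as X87, X87UP, NO_CLASS and MEMORY). */
typedef enum { CLASS_INTEGER, CLASS_SSE, CLASS_SSEUP } eightbyte_class;

/*
 * Hypothetical helper: applies only the post-merger cleanup rule quoted
 * above. Returns true when an aggregate made of n eightbytes has to be
 * passed in memory according to that single rule.
 */
bool passed_in_memory(const eightbyte_class *eb, size_t n)
{
    assert(eb != NULL || n == 0);

    if (n <= 2)
        return false;              /* the rule only applies beyond two eightbytes */
    if (eb[0] != CLASS_SSE)
        return true;               /* the first eightbyte isn't SSE */
    for (size_t i = 1; i < n; i++) {
        if (eb[i] != CLASS_SSEUP)
            return true;           /* some later eightbyte isn't SSEUP */
    }
    return false;                  /* e.g. one wide vector: SSE, SSEUP, SSEUP, ... */
}
```

For the student structure the four eightbytes are INTEGER, SSE, INTEGER, INTEGER, so calling passed_in_memory on that array returns true, which matches the pushes seen in the disassembly.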

You have unoptimized code, but we can break it down into chunks. First comes building the stack frame and allocating room for the local variable(s). Without optimizations enabled (which is the case here), the structure variable s is built on the stack, and then a copy of that structure is pushed onto the stack to make the call to print_student_info.

This builds the stack frame and allocates 32 bytes (0x20) for local variables (and maintains 16-byte alignment). Following natural alignment rules, your structure happens to be exactly 32 bytes in size:

 6fa:   55                      push   %rbp
 6fb:   48 89 e5                mov    %rsp,%rbp
 6fe:   48 83 ec 20             sub    $0x20,%rsp

Your variable s starts at RBP-0x20 and ends at RBP-0x01 (inclusive). The code builds and initializes the s variable (the student struct) on the stack. The 32-bit int 0xa (10) for the age field is placed at the beginning of the structure, at RBP-0x20. The 32-bit enum value for Man (0) is placed in the gen field at RBP-0x1c:

 702:   c7 45 e0 0a 00 00 00    movl   $0xa,-0x20(%rbp)
 709:   c7 45 e4 00 00 00 00    movl   $0x0,-0x1c(%rbp)

The compiler stores the constant 1.30 (type double) in memory. An ordinary x86 move instruction can't take two memory operands, so the compiler loads the double value 1.30 from memory location RIP+0x100 into vector register XMM0 and then stores the lower 64 bits of XMM0 to the height field on the stack at RBP-0x18:

 710:   f2 0f 10 05 00 01 00    movsd  0x100(%rip),%xmm0        # 818 <_IO_stdin_used+0x48>
 717:   00 
 718:   f2 0f 11 45 e8          movsd  %xmm0,-0x18(%rbp)

The value 3 for the class field is placed on the stack at RBP-0x10:

 71d:   c7 45 f0 03 00 00 00    movl   $0x3,-0x10(%rbp)

Lastly, the 64-bit address of the string "Tom" (in the read-only data section of the program) is loaded into RAX and then stored into the name field on the stack at RBP-0x08. Although class is only 32 bits wide (an int), it is followed by 4 bytes of padding because the next field, name, has to be naturally aligned on an 8-byte boundary, since a pointer is 8 bytes in size:

 724:   48 8d 05 e5 00 00 00    lea    0xe5(%rip),%rax        # 810 <_IO_stdin_used+0x40>
 72b:   48 89 45 f8             mov    %rax,-0x8(%rbp)
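
The offsets used in the stores above can be checked from C itself. The following is a small sketch, assuming an LP64 target such as x86-64 System V (4-byte int and enum, 8-byte double and pointer); on other targets the numbers may differ. The helper names check_student_layout and padding_after_class are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
        int age;
        enum {Man, Woman} gen;
        double height;
        int class;
        char *name;
} student;

/* Number of padding bytes between the end of class and the start of name. */
size_t padding_after_class(void)
{
        return offsetof(student, name) - (offsetof(student, class) + sizeof(int));
}

/* Asserts the layout implied by the disassembly (assumption: LP64 target).
 * RBP-0x20 is offset 0, RBP-0x1c is offset 4, and so on. */
void check_student_layout(void)
{
        assert(offsetof(student, age) == 0);      /* RBP-0x20 */
        assert(offsetof(student, gen) == 4);      /* RBP-0x1c */
        assert(offsetof(student, height) == 8);   /* RBP-0x18 */
        assert(offsetof(student, class) == 16);   /* RBP-0x10 */
        assert(offsetof(student, name) == 24);    /* RBP-0x08 */
        assert(sizeof(student) == 32);            /* 0x20 bytes in total */
}
```

If check_student_layout() returns without tripping an assertion, the compiler in use lays the structure out the same way as in the question, with 4 bytes of padding after class.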

At this point we have a structure built entirely on the stack. The compiler then copies it by pushing all 32 bytes of the structure (using four 64-bit pushes) onto the stack to make the function call. The pushes run from the last eightbyte to the first, so the copy ends up in memory in the same order as the original:

 72f:   ff 75 f8                pushq  -0x8(%rbp)
 732:   ff 75 f0                pushq  -0x10(%rbp)
 735:   ff 75 e8                pushq  -0x18(%rbp)
 738:   ff 75 e0                pushq  -0x20(%rbp)
 73b:   e8 70 ff ff ff          callq  6b0 <print_student_info>

Then comes the typical stack cleanup (the add removes the 32-byte copy) and the function epilogue:

 740:   48 83 c4 20             add    $0x20,%rsp
 744:   b8 00 00 00 00          mov    $0x0,%eax
 749:   c9                      leaveq 

Important note: in this case the registers were not used to pass parameters; they were part of the code that initialized the s variable (struct) on the stack.
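
The copy made by those four pushes is what gives pass by value its meaning at the C level. As a rough sketch (the functions grow_up and grow_up_in_place are hypothetical and not part of the question's program), a callee that receives the structure by value can only change its own copy, while a callee that receives a pointer changes the caller's object:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
        int age;
        enum {Man, Woman} gen;
        double height;
        int class;
        char *name;
} student;

/* Receives its own copy of the structure; changes made here never
 * reach the caller's variable. */
int grow_up(student s)
{
        s.age += 1;
        s.name = "Changed";
        return s.age;
}

/* Receives the address of the caller's structure; changes made here
 * are visible to the caller. */
int grow_up_in_place(student *s)
{
        assert(s != NULL);
        s->age += 1;
        return s->age;
}
```

With a student whose age is 10, grow_up returns 11 but leaves the caller's age at 10, whereas grow_up_in_place returns 11 and the caller's age becomes 11 as well. The second form also avoids copying the structure, which is one reason passing a pointer is usually preferred.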

How a structure is returned depends on the ABI as well, but I'll focus on the x86-64 System V ABI here since that is what your code is using.

By reference: a pointer to the structure is returned in RAX. Returning pointers to structures is preferred.

By value: a structure returned by value that is too large for the return registers (class MEMORY, as yours would be) forces the compiler to allocate additional space for the return structure in the caller; the address of that space is then passed to the function as a hidden first parameter in RDI. When it is finished, the called function places the address it received in RDI into RAX as the return value. Upon return, RAX therefore points to the storage holding the returned structure, which is always the same address that was passed in the hidden first parameter. The ABI discusses this in section 3.2.3 Parameter Passing, under the subheading Returning of Values, which says:

  1. If the type has class MEMORY, then the caller provides space for the return value and passes the address of this storage in %rdi as if it were the first argument to the function. In effect, this address becomes a "hidden" first argument. This storage must not overlap any data visible to the callee through other names than this argument. On return %rax will contain the address that has been passed in by the caller in %rdi.
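
The hidden-pointer convention can be imitated by hand in plain C. The sketch below is only an analogy, not what the compiler literally emits: make_student returns the structure by value in the ordinary way, while make_student_sret (a hypothetical name) spells out the same contract explicitly, with the caller supplying the storage (the role of RDI) and the callee returning that same address (the role of RAX).

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
        int age;
        enum {Man, Woman} gen;
        double height;
        int class;
        char *name;
} student;

/* Ordinary C: return the 32-byte structure by value. */
student make_student(int age, char *name)
{
        student s = { age, Man, 1.30, 3, name };
        return s;
}

/* Hand-written analogue of the hidden first argument: the caller owns
 * the storage behind out, the callee fills it in and hands the same
 * address back. */
student *make_student_sret(student *out, int age, char *name)
{
        assert(out != NULL);
        out->age = age;
        out->gen = Man;
        out->height = 1.30;
        out->class = 3;
        out->name = name;
        return out;
}
```

Both functions produce the same field values; the difference is only that the second one makes visible the storage address that the first one receives implicitly when the return type has class MEMORY.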
