What are the calling conventions for UNIX & Linux system calls (and user-space functions) on i386 and x86-64


Question

The following links explain the x86-32 system call conventions for both UNIX (BSD flavor) & Linux:

But what are the x86-64 system call conventions on both UNIX & Linux?

Solution

Further reading for any of the topics here: The Definitive Guide to Linux System Calls


I verified these using GNU Assembler (gas) on Linux.

Kernel Interface

x86-32 aka i386 Linux System Call convention:

In x86-32, parameters for Linux system calls are passed in registers: %eax holds the syscall number, and %ebx, %ecx, %edx, %esi, %edi, %ebp are used, in that order, for passing up to 6 parameters.

The return value is in %eax. All other registers (including EFLAGS) are preserved across the int $0x80.
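
The register mapping above could be sketched in C with GNU extended inline assembly, roughly as follows. This is only an illustration under stated assumptions: it assumes GCC or Clang, the helper name linux_write is hypothetical, and the int $0x80 path is compiled only for i386 targets. The #else branch exists purely so the file still builds elsewhere; it goes through libc's syscall() wrapper and is not an example of the i386 convention.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write(fd, buf, len) issued directly through int $0x80.
 * On i386 the result is the raw kernel return value (negative means -errno). */
static long linux_write(int fd, const void *buf, unsigned long len)
{
#if defined(__i386__)
    long ret;
    __asm__ volatile("int $0x80"
                     : "=a"(ret)               /* return value comes back in %eax */
                     : "0"((long)SYS_write),   /* syscall number in %eax */
                       "b"(fd),                /* 1st argument in %ebx */
                       "c"(buf),               /* 2nd argument in %ecx */
                       "d"(len)                /* 3rd argument in %edx */
                     : "memory");
    return ret;
#else
    /* Not the i386 convention: fallback so the sketch builds on other targets.
     * libc's wrapper returns -1 and sets errno on failure. */
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

A fourth, fifth and sixth argument would go in %esi, %edi and %ebp in the same way.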

I took the following snippet from the Linux Assembly Tutorial, but I'm doubtful about it. If anyone can show an example, that would be great.

If there are more than six arguments, %ebx must contain the memory location where the list of arguments is stored - but don't worry about this because it's unlikely that you'll use a syscall with more than six arguments.

For an example and a little more reading, refer to http://www.int80h.org/bsdasm/#alternate-calling-convention. Another example of a Hello World for i386 Linux using int 0x80: Hello, world in assembly language with Linux system calls?

There is a faster way to make 32-bit system calls: using sysenter. The kernel maps a page of memory into every process (the vDSO) that contains the user-space side of the sysenter dance, which has to cooperate with the kernel so that it can find the return address. The argument-to-register mapping is the same as for int $0x80. You should normally call into the vDSO instead of using sysenter directly. (See The Definitive Guide to Linux System Calls for info on linking and calling into the vDSO, and for more info on sysenter, and everything else to do with system calls.)
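
To see that the kernel really does map a vDSO into the process, one option is to ask the auxiliary vector for its address. The sketch below assumes a libc that provides getauxval() (glibc and musl do); the helper names vdso_base and looks_like_elf are hypothetical. It only locates the mapping and does not call into it.

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/* Address of the vDSO's ELF header, or NULL if the kernel did not report one. */
static const void *vdso_base(void)
{
    return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/* Returns 1 if the memory at p starts with the 4-byte ELF magic number. */
static int looks_like_elf(const void *p)
{
    return p != NULL && memcmp(p, "\177ELF", 4) == 0;
}
```

The vDSO is an ordinary ELF shared object, which is why a dynamic linker can resolve symbols in it.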

x86-32 [Free|Open|Net|DragonFly]BSD UNIX System Call convention:

Parameters are passed on the stack. Push the parameters onto the stack (last parameter first), then push an additional 32 bits of dummy data (it's not actually dummy data; refer to the following link for more info), and then execute the system call instruction int $0x80.

http://www.int80h.org/bsdasm/#default-calling-convention


x86-64 Linux System Call convention:

(Note: x86-64 Mac OS X is similar but different from Linux. TODO: check what *BSD does)

Refer to section: "A.2 AMD64 Linux Kernel Conventions" of System V Application Binary Interface AMD64 Architecture Processor Supplement. The latest versions of the i386 and x86-64 System V psABIs can be found linked from this page in the ABI maintainer's repo. (See also the tag wiki for up-to-date ABI links and lots of other good stuff about x86 asm.)

Here is the snippet from this section:

  1. User-level applications use as integer registers for passing the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9. The kernel interface uses %rdi, %rsi, %rdx, %r10, %r8 and %r9.
  2. A system-call is done via the syscall instruction. This clobbers %rcx and %r11 as well as the %rax return value, but other registers are preserved.
  3. The number of the syscall has to be passed in register %rax.
  4. System-calls are limited to six arguments, no argument is passed directly on the stack.
  5. Returning from the syscall, register %rax contains the result of the system-call. A value in the range between -4095 and -1 indicates an error, it is -errno.
  6. Only values of class INTEGER or class MEMORY are passed to the kernel.

Remember this is from the Linux-specific appendix to the ABI, and even for Linux it's informative not normative. (But it is in fact accurate.)

This 32-bit int $0x80 ABI is usable in 64-bit code, but that is strongly discouraged; see "What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?". It still truncates its inputs to 32 bits, so it's unsuitable for pointers, and it zeros r8-r11.

User Interface: function calling

x86-32 Function Calling convention:

In x86-32, parameters are passed on the stack. The last parameter is pushed first, and so on until all parameters have been pushed; then the call instruction is executed. This is used for calling C library (libc) functions on Linux from assembly.

Modern versions of the i386 System V ABI (used on Linux) require 16-byte alignment of %esp before a call, as the x86-64 System V ABI has always required. Callees are allowed to assume that and use SSE 16-byte loads/stores that fault on unaligned addresses. But historically, Linux only required 4-byte stack alignment, so it took extra work to reserve naturally-aligned space even for an 8-byte double or something.

Some other modern 32-bit systems still don't require more than 4-byte stack alignment.


x86-64 System V user-space Function Calling convention:

x86-64 System V passes args in registers, which is more efficient than i386 System V's stack args convention. It avoids the latency and extra instructions of storing args to memory (cache) and then loading them back again in the callee. This works well because there are more registers available, and is better for modern high-performance CPUs where latency and out-of-order execution matter. (The i386 ABI is very old).

In this new mechanism: First the parameters are divided into classes. The class of each parameter determines the manner in which it is passed to the called function.

For complete information, refer to "3.2 Function Calling Sequence" of the System V Application Binary Interface AMD64 Architecture Processor Supplement, which reads, in part:

Once arguments are classified, the registers get assigned (in left-to-right order) for passing as follows:

  1. If the class is MEMORY, pass the argument on the stack.
  2. If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used

So %rdi, %rsi, %rdx, %rcx, %r8 and %r9 are the registers used, in order, to pass integer/pointer (i.e. INTEGER class) parameters to any libc function from assembly. %rdi is used for the first INTEGER parameter, %rsi for the 2nd, %rdx for the 3rd, and so on. Then the call instruction should be executed. The stack (%rsp) must be 16-byte aligned when call executes.

If there are more than 6 INTEGER parameters, the 7th INTEGER parameter and later are passed on the stack. (Caller pops, same as x86-32.)
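
One way to see where the arguments land is to write the callee in assembly and let a C compiler do the call. The sketch below assumes GCC or Clang targeting x86-64 System V with the default AT&T syntax; the function name sum7 is hypothetical. On entry the return address sits at (%rsp), so the 7th INTEGER argument is at 8(%rsp).

```c
/* Prototype so the C compiler knows how to call the assembly routine. */
long sum7(long a, long b, long c, long d, long e, long f, long g);

__asm__(
    ".pushsection .text\n"
    ".globl sum7\n"
    ".type sum7, @function\n"
    "sum7:\n"
    "    leaq (%rdi,%rsi), %rax\n"   /* a + b: 1st and 2nd INTEGER args */
    "    addq %rdx, %rax\n"          /* c: 3rd arg */
    "    addq %rcx, %rax\n"          /* d: 4th arg */
    "    addq %r8, %rax\n"           /* e: 5th arg */
    "    addq %r9, %rax\n"           /* f: 6th arg */
    "    addq 8(%rsp), %rax\n"       /* g: 7th arg, first stack slot above the return address */
    "    ret\n"                      /* result in %rax; the caller cleans up the stack */
    ".size sum7, .-sum7\n"
    ".popsection\n"
);
```

Calling sum7(1, 2, 3, 4, 5, 6, 7) from C returns 28 only if the compiler and the assembly agree on these locations.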

The first 8 floating-point args are passed in %xmm0-7; later ones go on the stack. There are no call-preserved vector registers. (A function with a mix of FP and integer arguments can have more than 8 total register arguments.)
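
Integer and floating-point arguments are counted separately, which a small assembly callee can illustrate. This is a sketch assuming GCC or Clang on x86-64 System V with AT&T syntax; the function name mix is hypothetical. Although the arguments are interleaved in the prototype, a and b arrive in %rdi and %rsi, while x and y arrive in %xmm0 and %xmm1.

```c
/* Prototype with interleaved INTEGER and floating-point parameters. */
double mix(long a, double x, long b, double y);

__asm__(
    ".pushsection .text\n"
    ".globl mix\n"
    ".type mix, @function\n"
    "mix:\n"
    "    cvtsi2sdq %rdi, %xmm2\n"   /* a: 1st INTEGER arg, converted to double */
    "    addsd %xmm2, %xmm0\n"      /* x: 1st FP arg is already in %xmm0 */
    "    cvtsi2sdq %rsi, %xmm2\n"   /* b: 2nd INTEGER arg */
    "    addsd %xmm2, %xmm0\n"
    "    addsd %xmm1, %xmm0\n"      /* y: 2nd FP arg in %xmm1 */
    "    ret\n"                     /* double result is returned in %xmm0 */
    ".size mix, .-mix\n"
    ".popsection\n"
);
```

The scratch register %xmm2 can be overwritten freely because no vector registers are call-preserved in this ABI.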

Variadic functions (like printf) always need %al = the number of FP register args.
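
The %al rule and the stack alignment requirement can both be seen in a thin assembly wrapper around snprintf. This is a sketch assuming GCC or Clang on x86-64 Linux with AT&T syntax; the wrapper name format_double is hypothetical. Its parameters are chosen so that they already sit in the registers snprintf expects (%rdi, %rsi, %rdx and %xmm0).

```c
/* Hypothetical wrapper: forwards to snprintf(buf, size, fmt, x). */
int format_double(char *buf, unsigned long size, const char *fmt, double x);

__asm__(
    ".pushsection .text\n"
    ".globl format_double\n"
    ".type format_double, @function\n"
    "format_double:\n"
    "    subq $8, %rsp\n"          /* entry %rsp is 8 mod 16; realign to 16 before call */
    "    movl $1, %eax\n"          /* %al = 1: one FP argument is passed in a vector register */
    "    call snprintf@PLT\n"      /* buf, size, fmt, x are already in %rdi, %rsi, %rdx, %xmm0 */
    "    addq $8, %rsp\n"          /* undo the alignment padding */
    "    ret\n"                    /* snprintf's int result is still in %eax */
    ".size format_double, .-format_double\n"
    ".popsection\n"
);
```

Forgetting to set %al, or calling with a misaligned stack, may appear to work in some builds and crash in others, so it is worth comparing against compiler output.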

There are rules for when to pack structs into registers (rdx:rax on return) vs. in memory. See the ABI for details, and check compiler output to make sure your code agrees with compilers about how something should be passed/returned.


Note that the Windows x64 function calling convention has multiple significant differences from x86-64 System V, like shadow space that must be reserved by the caller (instead of a red zone), and call-preserved xmm6-xmm15. It also has very different rules for which arg goes in which register.
