Function called in a file without a prototype produces different results on ARM and x86-64

Question

We have 3 files: main.c, lib.h and lib.c:

main.c:

#include <stdio.h>
#include <stdlib.h>

/* #include "lib.h" */

int main(void)
{
    printf("sizeof unsigned long long: %zu\n", sizeof(unsigned long long));
    printf("sizeof int: %zu\n", sizeof(int));
    unsigned long long slot = 0;
    int pon_off = 1;
    lib_fn(slot, pon_off);
    return EXIT_SUCCESS;
}

lib.h:

void lib_fn(unsigned slot, int pon_off);

lib.c:

#include <stdio.h>
#include <stdlib.h>

void lib_fn(unsigned slot, int pon_off)
{
    printf("slot: %u\n", slot);
    printf("pon_off: %d\n", pon_off);
    return;
}

Compile:

gcc -O2 -Wall -Wextra main.c lib.c

Run on ARM:

$ ./a.out
sizeof unsigned long long: 8
sizeof int: 4
slot: 0
pon_off: 0

Run on x86-64:

$ ./a.out
sizeof unsigned long long: 8
sizeof int: 4
slot: 0
pon_off: 1

As you can see, pon_off is 0 on ARM but 1 on x86-64. I guess it has something to do with argument sizes, since lib_fn() takes two 4-byte integers that together are 8 bytes long, and a single long long is also 8 bytes long.

  1. Why is pon_off printed differently on ARM and x86-64?
  2. Does it have something to do with a calling convention?

Answer

Does it have something to do with a calling convention?

Yes, it has everything to do with the calling convention / ABI.

On x86-64, the "natural" width of a function argument is 64 bits, and narrower integer args still use a whole "slot". (In the System V ABI, the first 6 integer/pointer args and the first 8 FP args go in registers; in the Windows x64 convention, the first 4 args do. The rest go on the stack.)

On 32-bit ARM, the register width (and the minimum "arg slot" width on the stack) is 32 bits, and 64-bit integer args take two registers.

On 32-bit x86 (gcc -m32) you would see the same behaviour as on 32-bit ARM. On AArch64, you would see the same behaviour as on x86-64, because their calling conventions are all "normal" and don't pack separate narrow args into a single register. (x86-64 System V does pack struct members into up to 2 registers, though, instead of using a separate register per member!)

Having a minimum "arg slot" width equal to the register size is nearly universal, whether args are passed in registers or on the stack. This isn't necessarily the width of int, though: AVR (an 8-bit RISC microcontroller) has a 16-bit int which takes two registers, but char / uint8_t args can be passed in a single register.

With a prototype, wider/narrower argument types are converted to what the callee expects, according to the parameter types in the prototype. So everything works.
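As a small illustration of that conversion, here is a minimal sketch. The helper name as_prototyped_arg is hypothetical (it is not part of the question's code); it only performs the same unsigned long long to unsigned conversion that the compiler inserts at a call where the prototype of lib_fn is visible.

```c
#include <limits.h>

/* Hypothetical helper (not part of the question's code): performs the same
 * conversion the compiler applies when an unsigned long long argument is
 * passed to a parameter declared as unsigned in a visible prototype. */
unsigned as_prototyped_arg(unsigned long long wide)
{
    /* Unsigned narrowing is well defined in C: the value is reduced
     * modulo UINT_MAX + 1, i.e. the high bits are simply dropped. */
    return (unsigned)wide;
}
```

In practice the fix for the question is to include lib.h in main.c (and in lib.c, so the compiler can check the definition against the declaration). With GCC, -Werror=implicit-function-declaration turns the missing declaration into a hard error.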

Without a prototype, the type of the expression in the call determines how the arg is passed. unsigned long long slot takes the first 2 arg-passing registers in ARM's calling convention, which is exactly where lib_fn expects to find its 2 integer args. pon_off ends up in the third register, which lib_fn never reads.
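The mismatch can be sketched with a toy model of the 32-bit ARM argument registers. This is only a simulation under stated assumptions, not real ABI code: the struct and function names are made up for illustration, uint64_t stands in for unsigned long long, and the low half is assumed to go in r0 as in the compiler output for this case.

```c
#include <stdint.h>

/* Toy model of the four 32-bit argument registers r0-r3 on 32-bit ARM.
 * Illustrative only; all names here are hypothetical. */
struct arm_arg_regs {
    uint32_t r[4];
};

/* What the caller does WITHOUT a prototype: the 64-bit slot occupies
 * r0 (low half) and r1 (high half), so pon_off lands in r2. */
struct arm_arg_regs caller_without_prototype(uint64_t slot, int pon_off)
{
    struct arm_arg_regs regs = { { 0, 0, 0, 0 } };
    regs.r[0] = (uint32_t)(slot & 0xFFFFFFFFu);
    regs.r[1] = (uint32_t)(slot >> 32);
    regs.r[2] = (uint32_t)pon_off;
    return regs;
}

/* What the caller does WITH the prototype: slot is first converted to a
 * 32-bit unsigned, so it needs only r0 and pon_off lands in r1. */
struct arm_arg_regs caller_with_prototype(uint64_t slot, int pon_off)
{
    struct arm_arg_regs regs = { { 0, 0, 0, 0 } };
    regs.r[0] = (uint32_t)slot;
    regs.r[1] = (uint32_t)pon_off;
    return regs;
}

/* lib_fn(unsigned slot, int pon_off) always reads pon_off from r1. */
int pon_off_seen_by_callee(struct arm_arg_regs regs)
{
    return (int)regs.r[1];
}
```

Calling pon_off_seen_by_callee(caller_without_prototype(0, 1)) gives 0, because r1 holds the high half of slot, while the prototyped variant gives 1. That matches the ARM output in the question.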

(The answer claiming that everything is converted to int without a prototype is wrong. As far as argument passing goes, having no prototype is equivalent to int lib_fn(...);, yet printf still works with double and int64_t. Note that float is implicitly converted to double when passed to a variadic function, just like narrower integer types are promoted to int. That is why %f is the printf format for double and there is no format for float, unlike with scanf, where you pass pointers. That's just how C is designed; there's no reason for it. But anyway, C requires that wider types can be passed as-is to variadic functions, and all calling conventions accommodate that.)
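The default argument promotions mentioned above can be observed with a small variadic function. This is a minimal sketch; sum_double_then_int is a hypothetical name chosen for illustration.

```c
#include <stdarg.h>

/* Hypothetical demo function: reads one double and then one int
 * from its variable arguments and returns their sum. */
double sum_double_then_int(int tag, ...)
{
    va_list ap;
    double d;
    int i;

    (void)tag; /* only present because va_start needs a named parameter */
    va_start(ap, tag);
    d = va_arg(ap, double); /* a float argument arrives here as double */
    i = va_arg(ap, int);    /* a char or short argument arrives here as int */
    va_end(ap);
    return d + i;
}
```

A call such as sum_double_then_int(0, 1.5f, (char)2) passes a float and a char, but the callee must read them as double and int. A 64-bit integer argument is not narrowed in this way, which is the situation in the question.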

BTW, other breakage is possible: some implementations use a different calling convention for variadic (and thus unprototyped) functions than for normal functions.

For example, on Windows you can set some compilers to default to the _stdcall calling convention, where the callee pops the args from the stack (i.e. with a ret 8 to do esp+=8 after popping the return address). But this calling convention isn't usable for variadic functions, so the default doesn't apply to them: they use _cdecl or something similar, where the caller is responsible for cleaning up stack args, because only the caller knows for sure how many args it passed. Hopefully, in this mode, compilers would at least warn (if not error) on implicitly declared functions, because getting it wrong leads to a crash (the stack pointer pointing to the wrong place after a call).

For an introduction to reading compiler asm output, see How to remove "noise" from GCC/clang assembly output?, and especially Matt Godbolt's CppCon2017 talk "What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid".

To make the asm as simple as possible, I removed the printing and put the code in a function that returns void. (This allows tail-call optimization, where you jump to the function and it returns directly to your caller.) The only instructions in the compiler output are the arg setup and the jump to lib_fn.

#ifdef USE_PROTO
void lib_fn(unsigned slot, int pon_off);
#endif

void foo(void) {
    unsigned long long slot = 0;
    int pon_off = 1;
    lib_fn(slot, pon_off);
}

See the source+asm on the Godbolt compiler explorer, for ARM, x86-64, and x86-32 (-m32) with gcc 6.3. (I actually copied foo and renamed lib_fn so that it has no prototype in one version of the caller, instead of having 2 separate compiler windows for each architecture. In a more complex case, separate panes would be handy because you can diff between compiler panes.)

For x86-64, the output is basically the same with and without the prototype. Without it, the caller has to zero al (using xor eax,eax, which zeroes the whole RAX) to indicate that this variadic function call passes no FP args in XMM registers. (In the Windows calling convention you wouldn't have that, because the Windows convention is optimized for variadic functions and for simplicity of implementing them, at the expense of normal functions.)
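Why the unprototyped call happens to work on x86-64 can be sketched with a toy model of the first two integer argument registers. As with any such model, this is a simulation under stated assumptions rather than real ABI code: the struct and function names are hypothetical, and uint64_t stands in for unsigned long long.

```c
#include <stdint.h>

/* Toy model of the first two integer argument registers in the
 * x86-64 System V convention. Illustrative only; names are hypothetical. */
struct x86_64_arg_regs {
    uint64_t rdi;
    uint64_t rsi;
};

/* Caller without a prototype: the whole 64-bit slot fits in RDI,
 * so pon_off still goes in the second register, RSI. */
struct x86_64_arg_regs caller_without_prototype_x86_64(uint64_t slot, int pon_off)
{
    struct x86_64_arg_regs regs;
    regs.rdi = slot;
    regs.rsi = (uint32_t)pon_off;
    return regs;
}

/* lib_fn(unsigned slot, int pon_off) reads only the low 32 bits of RDI... */
unsigned slot_seen_by_callee_x86_64(struct x86_64_arg_regs regs)
{
    return (unsigned)(regs.rdi & 0xFFFFFFFFu);
}

/* ...and the low 32 bits of RSI. */
int pon_off_seen_by_callee_x86_64(struct x86_64_arg_regs regs)
{
    return (int)(uint32_t)regs.rsi;
}
```

Because one register holds the entire 64-bit value, pon_off stays in the register the callee expects, and the callee ignoring the upper half of RDI has the same effect as the conversion a prototype would have performed.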

For ARM:

foo:                  @ no prototype
    mov     r2, #1    @ pon_off
    mov     r0, #0    @ slot low half
    mov     r1, #0    @ slot high half
    b       lib_fn_noproto


bar:                  @ with proto, u long long is converted to unsigned according to C rules, like the callee expects
    mov     r1, #1
    mov     r0, #0
    b       lib_fn

lib_fn expects slot in R0 and pon_off in R1.

You'd have the same problem on x86-64 if you used unsigned __int128:

lib_fn_noproto((unsigned __int128)slot, pon_off);

compiles to:

    mov     edx, 1          # pon_off = EDX = 1
    xor     edi, edi        #  low half of slot = RDI = 0
    xor     esi, esi        # high half of slot = RSI = 0
    xor     eax, eax        # number of xmm register args = 0
    jmp     lib_fn_noproto

which breaks the calling convention in exactly the same way as on 32-bit ARM, where a 64-bit arg takes the first 2 slots.
