Function called in a file without a prototype produces different results on ARM and x86-64

Problem description

We have 3 files: main.c, lib.h and lib.c:

main.c:

#include <stdio.h>
#include <stdlib.h>

/* #include "lib.h" */

int main(void)
{
    printf("sizeof unsigned long long: %zu\n", sizeof(unsigned long long));
    printf("sizeof int: %zu\n", sizeof(int));
    unsigned long long slot = 0;
    int pon_off = 1;
    lib_fn(slot, pon_off);
    return EXIT_SUCCESS;
}

lib.h:

void lib_fn(unsigned slot, int pon_off);

lib.c:

#include <stdio.h>
#include <stdlib.h>

void lib_fn(unsigned slot, int pon_off)
{
    printf("slot: %u\n", slot);
    printf("pon_off: %d\n", pon_off);
    return;
}

Compile:

gcc -O2 -Wall -Wextra main.c lib.c

Run on ARM:

$ ./a.out
sizeof unsigned long long: 8
sizeof int: 4
slot: 0
pon_off: 0

Run on x86-64:

$ ./a.out
sizeof unsigned long long: 8
sizeof int: 4
slot: 0
pon_off: 1

As you can see, pon_off is 0 on ARM but 1 on x86-64. I guess it has something to do with argument sizes, since lib_fn() takes two ints that together are 8 bytes long, and a single long long is also 8 bytes long.

  1. Why is pon_off printed differently on ARM and x86-64?

  2. Does it have something to do with a calling convention?

Recommended answer

Does it have something to do with a calling convention?

Yes, it has everything to do with the calling convention / ABI.

On x86-64, the "natural" width of a function argument is 64 bits, and narrower integer args still use a whole "slot". (The first 6 integer/pointer args and the first 8 FP args go in registers in the System V convention, or the first 4 args on Windows; the rest go on the stack.)

On ARM, the register width (and "arg slot" minimum width on the stack) is 32 bits, and 64-bit integer args take two registers.

On 32-bit x86 (gcc -m32) you would see the same behaviour as 32-bit ARM. On AArch64, you would see the same behaviour as x86-64, because their calling conventions are all "normal" and don't pack separate narrow args into single registers. (x86-64 System V does pack struct members into up to 2 registers, though, instead of using a separate register per member!)

Having a minimum "arg slot" width that's equal to the register size is nearly universal, whether args are passed in registers or on the stack. This isn't necessarily the width of int, though: AVR (8-bit RISC microcontroller) has 16-bit int which takes two registers, but char / uint8_t args can be passed in a single register.

With a prototype, wider/narrower types are converted to what the callee expects, according to the types in the prototype. So obviously everything works.

Without a prototype, the type of the expression in the call determines how the arg is passed. unsigned long long slot takes the first 2 arg-passing registers in ARM's calling convention, where lib_fn expects to find its 2 integer args.

(The answer claiming everything is converted to int without a prototype is wrong. No prototype is equivalent to int lib_fn(...);, but printf still works with double and int64_t. Note that float is implicitly converted to double when passing to a variadic function, just like narrower integer types are up-converted to int, which is why %f is the format for double, and there is no format for float, unlike with scanf where you pass pointers. That's just how C is designed; there's no reason for it. But anyway, C requires that wider types are able to be passed as is to variadic functions, and all calling conventions accommodate that.)

BTW, other breakage is possible: Some implementations use a different calling convention for variadic (and thus unprototyped) functions than for normal functions.

For example, on Windows you can set some compilers to default to the _stdcall calling convention, where the callee pops the args from the stack. (i.e. with a ret 8 to do esp+=8 after popping the return address.) But obviously this calling convention isn't usable for variadic functions, so the default doesn't apply to them, and they would use _cdecl or something where the caller is responsible for cleaning up stack args, because only the caller knows for sure how many args they passed. Hopefully in this mode compilers would at least warn if not error for implicitly declared functions, because getting it wrong leads to a crash (stack pointing to the wrong place after a call).

For an introduction to reading compiler asm output, see How to remove "noise" from GCC/clang assembly output?, and especially Matt Godbolt's CppCon2017 talk "What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid".

To make the asm as simple as possible, I removed the printing and put the code in a function that returns void. (This allows tail-call optimization where you jump to the function and it returns to your caller.) The only instructions in the compiler output are the arg setup and jumping to lib_fn.

#ifdef USE_PROTO
void lib_fn(unsigned slot, int pon_off);
#endif

void foo(void) {
    unsigned long long slot = 0;
    int pon_off = 1;
    lib_fn(slot, pon_off);
}

See the source+asm on the Godbolt compiler explorer, for ARM, x86-64, and x86-32 (-m32) with gcc 6.3. (I actually copied foo and renamed lib_fn so it would have no prototype in one version of the caller, instead of having 2 separate compiler windows for each architecture. In a more complex case, that would be handy because you can diff between compiler panes).

For x86-64, the output is basically the same with/without the prototype. Without, the caller has to zero al (using xor eax,eax to zero the whole RAX) to indicate that this variadic function call is passing no FP args in XMM registers. (In the Windows calling convention, you wouldn't have that because the Windows convention is optimized for variadic functions and simplicity of implementing them at the expense of normal functions.)

For ARM:

foo:                  @ no prototype
    mov     r2, #1    @ pon_off
    mov     r0, #0    @ slot low half
    mov     r1, #0    @ slot high half
    b       lib_fn_noproto


bar:                  @ with proto, u long long is converted to unsigned according to C rules, like the callee expects
    mov     r1, #1
    mov     r0, #0
    b       lib_fn

lib_fn is expecting slot in R0 and pon_off in R1.

You'd have the same problem on x86-64 if you used unsigned __int128.

lib_fn_noproto((unsigned __int128)slot, pon_off);

compiles to:

    mov     edx, 1          # pon_off = EDX = 1
    xor     edi, edi        #  low half of slot = RDI = 0
    xor     esi, esi        # high half of slot = RSI = 0
    xor     eax, eax        # number of xmm register args = 0
    jmp     lib_fn_noproto

which breaks the calling convention in exactly the same way as for 32-bit ARM with a 64-bit arg taking the first 2 slots.
