Why does integer division by -1 (negative one) result in FPE?


Problem description


I have an assignment to explain some seemingly strange behaviors of C code (running on x86). I can easily complete everything else, but this one has really confused me.

Code snippet 1 outputs -2147483648

int a = 0x80000000;
int b = a / -1;
printf("%d\n", b);

Code snippet 2 outputs nothing, and gives a Floating point exception

int a = 0x80000000;
int b = -1;
int c = a / b;
printf("%d\n", c);

I understand the reason for the result of Code Snippet 1 (1 + ~INT_MIN == INT_MIN), but I can't quite understand how integer division by -1 can generate an FPE, nor can I reproduce it on my Android phone (AArch64, GCC 7.2.0). Code 2 just outputs the same as Code 1, without any exceptions. Is it a hidden bug/feature of the x86 processor?

The assignment didn't specify anything else (including the CPU architecture), but since the whole course is based on a desktop Linux distro, you can safely assume it's a modern x86.


Edit: I contacted my friend and he tested the code on Ubuntu 16.04 (Intel Kaby Lake, GCC 6.3.0). The result was consistent with what the assignment stated (Code 1 printed -2147483648 and Code 2 crashed with an FPE).

Solution

There are four things going on here:

  • gcc -O0 behaviour explains the difference between your two versions (while clang -O0 happens to compile them both with idiv), and why you get this even with compile-time-constant operands.
  • x86 idiv faulting behaviour vs. behaviour of the division instruction on ARM
  • If integer math results in a signal being delivered, POSIX requires it to be SIGFPE: On which platforms does integer divide by zero trigger a floating point exception? But POSIX doesn't require trapping for any particular integer operation. (This is why x86 and ARM are allowed to differ.) See the signal-catching sketch after this list.

    The Single Unix Specification defines SIGFPE as "Erroneous arithmetic operation". It's confusingly named after floating point, but in a normal system with the FPU in its default state, only integer math will raise it. On x86, only integer division. On MIPS, a compiler could use add instead of addu for signed math, so you could get traps on signed add overflow. (gcc uses addu even for signed, but an undefined-behaviour detector might use add.)

  • C Undefined Behaviour rules (signed overflow, and division specifically), which let gcc emit code that can trap in that case.
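
To observe the SIGFPE delivery from the third point, here is a minimal sketch, assuming a POSIX system such as desktop Linux on x86; the handler name and the sigsetjmp recovery are my own choices, not part of the original answer:

#define _POSIX_C_SOURCE 200809L   // for sigaction/sigsetjmp under strict -std modes
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

static sigjmp_buf env;

static void on_fpe(int sig) {
    (void)sig;
    siglongjmp(env, 1);   // can't just return: the faulting idiv would re-execute
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fpe;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGFPE, &sa, NULL);

    volatile int a = 0x80000000;   // INT_MIN; volatile hides the values from the optimizer
    volatile int b = -1;
    if (sigsetjmp(env, 1) == 0)
        printf("%d\n", a / b);     // runtime idiv: traps on x86, prints -2147483648 on AArch64
    else
        puts("caught SIGFPE");
    return 0;
}

Because a and b are volatile, the division happens at run time even with optimization enabled, so on x86 this should print "caught SIGFPE" at any -O level.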

gcc with no options is the same as gcc -O0.

-O0 Reduce compilation time and make debugging produce the expected results. This is the default.

This explains the difference between your two versions:

Not only does gcc -O0 not try to optimize, it actively de-optimizes to make asm that independently implements each C statement within a function. This allows gdb's jump command to work safely, letting you jump to a different line within the function and act like you're really jumping around in the C source.

It also can't assume anything about variable values between statements, because you can change variables with set b = 4. This is obviously catastrophically bad for performance, which is why -O0 code runs several times slower than normal code, and why optimizing for -O0 specifically is total nonsense. It also makes -O0 asm output really noisy and hard for a human to read, because of all the storing/reloading, and lack of even the most obvious optimizations.

int a = 0x80000000;
int b = -1;
  // debugger can stop here on a breakpoint and modify b.
int c = a / b;        // a and b have to be treated as runtime variables, not constants.
printf("%d\n", c);

I put your code inside functions on the Godbolt compiler explorer to get the asm for those statements.

To evaluate a/b, gcc -O0 has to emit code to reload a and b from memory, and not make any assumptions about their value.

But with int c = a / -1;, you can't change the -1 with a debugger, so gcc can and does implement that statement the same way it would implement int c = -a;, with an x86 neg eax or AArch64 neg w0, w0 instruction, surrounded by a load(a)/store(c). On ARM32, it's a rsb r3, r3, #0 (reverse-subtract: r3 = 0 - r3).

However, clang5.0 -O0 doesn't do that optimization. It still uses idiv for a / -1, so both versions will fault on x86 with clang. Why does gcc "optimize" at all? See Disable all optimization options in GCC. gcc always transforms through an internal representation, and -O0 is just the minimum amount of work needed to produce a binary. It doesn't have a "dumb and literal" mode that tries to make the asm as much like the source as possible.
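
If you want to reproduce that comparison yourself (e.g. on Godbolt), a pair of wrapper functions along these lines (hypothetical names) makes the difference visible in the -O0 asm:

int div_by_constant(int a) {
    return a / -1;      // gcc -O0: neg eax; clang -O0: still cdq + idiv
}

int div_by_variable(int a, int b) {
    return a / b;       // both compilers: cdq + idiv on x86
}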


x86 idiv vs. AArch64 sdiv:

x86-64:

    # int c = a / b  from x86_fault()
    mov     eax, DWORD PTR [rbp-4]
    cdq                                 # dividend sign-extended into edx:eax
    idiv    DWORD PTR [rbp-8]           # divisor from memory
    mov     DWORD PTR [rbp-12], eax     # store quotient

Unlike imul r32,r32, there's no 2-operand idiv that doesn't have a dividend upper-half input. Anyway, not that it matters; gcc is only using it with edx = copies of the sign bit in eax, so it's really doing a 32b / 32b => 32b quotient + remainder. As documented in Intel's manual, idiv raises #DE on:

  • divisor = 0
  • The signed result (quotient) is too large for the destination.

Overflow can easily happen if you use the full range of divisors, e.g. for int result = long long / int with a single 64b / 32b => 32b division. But gcc can't do that optimization, because it's not allowed to make code that would fault instead of following the C integer promotion rules (doing a 64-bit division and then truncating to int). It also doesn't optimize even in cases where the divisor is known to be large enough that it couldn't #DE.
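
As a concrete illustration of that promotion rule (values invented for the example), the C semantics below demand a full 64-bit division even though a single 64b / 32b idiv looks tempting:

long long big = 0x123456789LL;  // quotient won't fit in 32 bits
int d = 1;
int q = (int)(big / d);  // C: promote d to long long, divide 64/64, then truncate;
                         // the out-of-range conversion is implementation-defined.
                         // A raw 64b/32b idiv would #DE on these inputs.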

When doing 32b / 32b division (with cdq), the only input that can overflow is INT_MIN / -1. The "correct" quotient is a 33-bit signed integer, i.e. positive 0x80000000 with a leading-zero sign bit to make it a positive 2's complement signed integer. Since this doesn't fit in eax, idiv raises a #DE exception. The kernel then delivers SIGFPE.

AArch64:

    # int c = a / b  from x86_fault()  (which doesn't fault on AArch64)
    ldr     w1, [sp, 12]
    ldr     w0, [sp, 8]          # 32-bit loads into 32-bit registers
    sdiv    w0, w1, w0           # 32 / 32 => 32 bit signed division
    str     w0, [sp, 4]

AFAICT, ARM hardware division instructions don't raise exceptions for divide by zero or for INT_MIN/-1. Or at least, some ARM CPUs don't (see: divide by zero exception in ARM OMAP3515 processor).

AArch64 sdiv documentation doesn't mention any exceptions; in ARMv8, division by zero is architecturally defined to produce 0, and INT_MIN / -1 wraps to INT_MIN.

However, software implementations of integer division may raise: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka4061.html. (gcc uses a library call, the EABI helper __aeabi_idiv, for division on ARM32 by default, unless you set a -mcpu that has HW division.)


C Undefined Behaviour.

As PSkocik explains, INT_MIN / -1 is undefined behaviour in C, like all signed integer overflow. This allows compilers to use hardware division instructions on machines like x86 without checking for that special case. If division were required not to fault, unknown inputs would need run-time compare-and-branch checks, and nobody wants C to require that.
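
To make those checks concrete, here is a sketch of the wrapper every division would effectively need if the trap cases had defined results; the name safe_div and the fallback values are mine, not something the C standard or the original answer prescribes:

#include <limits.h>

int safe_div(int a, int b) {
    if (b == 0)
        return 0;        // substitute whatever error policy you prefer
    if (a == INT_MIN && b == -1)
        return INT_MIN;  // the 2's complement wraparound result
    return a / b;        // safe: can't trap now
}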


More about the consequences of UB:

With optimization enabled, the compiler can assume that a and b still have their set values when a/b runs. It can then see the program has undefined behaviour, and thus can do whatever it wants. gcc chooses to produce INT_MIN like it would from -INT_MIN.

On a 2's complement system, the most-negative number is its own negative. This is a nasty corner-case for 2's complement, because it means abs(x) can still be negative. https://en.wikipedia.org/wiki/Two%27s_complement#Most_negative_number
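
A quick demo of that corner case (the call itself is undefined behaviour, which is exactly the point):

#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    // UB: on typical 2's complement machines this prints -2147483648
    printf("%d\n", abs(INT_MIN));
    return 0;
}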

int x86_fault() {
    int a = 0x80000000;
    int b = -1;
    int c = a / b;
    return c;
}

compiles to this with gcc 6.3 -O3 for x86-64:

x86_fault:
    mov     eax, -2147483648
    ret

but clang 5.0 -O3 compiles it to this (with no warning, even with -Wall -Wextra):

x86_fault:
    ret

Undefined Behaviour really is totally undefined. Compilers can do whatever they feel like, including returning whatever garbage was in eax on function entry, or loading a NULL pointer and an illegal instruction. e.g. with gcc6.3 -O3 for x86-64:

int *local_address(int a) {
    return &a;
}

local_address:
    xor     eax, eax     # return 0
    ret

void foo() {
    int *p = local_address(4);
    *p = 2;
}

foo:
    mov     DWORD PTR ds:0, 0     # store immediate 0 into absolute address 0
    ud2                           # illegal instruction

Your case with -O0 didn't let the compilers see the UB at compile time, so you got the "expected" asm output.

See also What Every C Programmer Should Know About Undefined Behavior (the same LLVM blog post that Basile linked).
