Why does integer division by -1 (negative one) result in FPE?


Question

I have an assignment of explaining some seemingly strange behaviors of C code (running on x86). I can easily complete everything else but this one has really confused me.

Code snippet 1 outputs -2147483648

int a = 0x80000000;
int b = a / -1;
printf("%d\n", b);

Code snippet 2 outputs nothing, and gives a floating point exception

int a = 0x80000000;
int b = -1;
int c = a / b;
printf("%d\n", c);

I well know the reason for the result of Code Snippet 1 (1 + ~INT_MIN == INT_MIN), but I can't quite understand how integer division by -1 can generate an FPE, nor can I reproduce it on my Android phone (AArch64, GCC 7.2.0). Code 2 just outputs the same as Code 1, without any exceptions. Is it a hidden bug feature of the x86 processor?

The assignment didn't say anything else (including the CPU architecture), but since the whole course is based on a desktop Linux distro, you can safely assume it's a modern x86.

Edit: I contacted my friend and he tested the code on Ubuntu 16.04 (Intel Kaby Lake, GCC 6.3.0). The results were consistent with what the assignment stated (Code 1 printed the stated value and Code 2 crashed with an FPE).

Answer

There are four things going on here:

  • gcc -O0 behaviour explains the difference between your two versions: idiv vs. neg. (While clang -O0 happens to compile them both with idiv.) And why you get this even with compile-time-constant operands.

  • x86 idiv faulting behaviour vs. the behaviour of the division instruction on ARM

  • If integer math results in a signal being delivered, POSIX requires it to be SIGFPE: On which platforms does integer divide by zero trigger a floating point exception? But POSIX doesn't require trapping for any particular integer operation. (This is why x86 and ARM are allowed to differ.)

The Single Unix Specification defines SIGFPE as "erroneous arithmetic operation". It's confusingly named after floating point, but in a normal system with the FPU in its default state, only integer math will raise it. On x86, that means only integer division. On MIPS, a compiler could use add instead of addu for signed math, so you could get traps on signed-add overflow. (gcc uses addu even for signed math, but an undefined-behaviour detector might use add.)

  • C undefined-behaviour rules (signed overflow, and division specifically), which let gcc emit code that can trap in that case.

gcc with no options is the same as gcc -O0.

    -O0  Reduce compilation time and make debugging produce the expected results. This is the default.

This explains the difference between your two versions:

Not only does gcc -O0 not try to optimize, it actively de-optimizes to make asm that independently implements each C statement within a function. This allows gdb's jump command to work safely, letting you jump to a different line within the function and act like you're really jumping around in the C source. Why does clang produce inefficient asm with -O0 (for this simple floating point sum)? explains more about how and why -O0 compiles the way it does.

It also can't assume anything about variable values between statements, because you can change variables with set b = 4. This is obviously catastrophically bad for performance, which is why -O0 code runs several times slower than normal code, and why optimizing for -O0 specifically is total nonsense. It also makes -O0 asm output really noisy and hard for a human to read, because of all the storing/reloading and the lack of even the most obvious optimizations.

int a = 0x80000000;
int b = -1;
  // debugger can stop here on a breakpoint and modify b.
int c = a / b;        // a and b have to be treated as runtime variables, not constants.
printf("%d\n", c);

I put your code inside functions on the Godbolt compiler explorer to get the asm for those statements.

To evaluate a/b, gcc -O0 has to emit code to reload a and b from memory, and not make any assumptions about their values.

But with int c = a / -1;, you can't change the -1 with a debugger, so gcc can and does implement that statement the same way it would implement int c = -a;, with an x86 neg eax or AArch64 neg w0, w0 instruction, surrounded by a load(a)/store(c). On ARM32, it's rsb r3, r3, #0 (reverse subtract: r3 = 0 - r3).

However, clang5.0 -O0 doesn't do that optimization. It still uses idiv for a / -1, so both versions will fault on x86 with clang. Why does gcc "optimize" at all? See Disable all optimization options in GCC. gcc always transforms through an internal representation, and -O0 is just the minimum amount of work needed to produce a binary. It doesn't have a "dumb and literal" mode that tries to make the asm as much like the source as possible.

x86-64:

    # int c = a / b  from x86_fault()
    mov     eax, DWORD PTR [rbp-4]
    cdq                                 # dividend sign-extended into edx:eax
    idiv    DWORD PTR [rbp-8]           # divisor from memory
    mov     DWORD PTR [rbp-12], eax     # store quotient

Unlike imul r32,r32, there's no 2-operand form of idiv without a dividend upper-half input. Not that it matters anyway; gcc is only using it with edx = copies of the sign bit in eax, so it's really doing a 32b / 32b => 32b quotient + remainder. As documented in Intel's manual, idiv raises #DE on:

  • divisor = 0
  • The signed result (quotient) is too large for the destination.

Overflow can easily happen if you use the full range of divisors, e.g. for int result = long long / int with a single 64b / 32b => 32b division. But gcc can't do that optimization, because it's not allowed to make code that would fault instead of following the C integer-promotion rules (doing a 64-bit division and then truncating to int). It also doesn't optimize even in cases where the divisor is known to be large enough that it couldn't #DE.

When doing 32b / 32b division (with cdq), the only input that can overflow is INT_MIN / -1. The "correct" quotient is a 33-bit signed integer: positive 0x80000000 with a leading-zero sign bit, making it a positive 2's complement signed integer. Since this doesn't fit in eax, idiv raises a #DE exception. The kernel then delivers SIGFPE.

AArch64:

    # int c = a / b  from x86_fault()  (which doesn't fault on AArch64)
    ldr     w1, [sp, 12]
    ldr     w0, [sp, 8]          # 32-bit loads into 32-bit registers
    sdiv    w0, w1, w0           # 32 / 32 => 32 bit signed division
    str     w0, [sp, 4]

ARM hardware division instructions don't raise exceptions for divide by zero or for INT_MIN / -1 overflow. Nate Eldredge commented:

The full ARM Architecture Reference Manual states that UDIV or SDIV, when dividing by zero, simply return zero as the result, "without any indication that the division by zero occurred" (C3.4.8 in the Armv8-A version). No exceptions and no flags - if you want to catch divide by zero, you have to write an explicit test. Likewise, signed division of INT_MIN by -1 returns INT_MIN with no indication of the overflow.

The AArch64 sdiv documentation doesn't mention any exceptions.

However, software implementations of integer division may raise one: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.faqs/ka4061.html. (By default, gcc uses a library call for division on ARM32, unless you set a -mcpu that has hardware division.)

As PSkocik explains, INT_MIN / -1 is undefined behaviour in C, like all signed integer overflow. This allows compilers to use hardware division instructions on machines like x86 without checking for that special case. If it were required not to fault, unknown inputs would need run-time compare-and-branch checks, and nobody wants C to require that.

More about the consequences of UB:

With optimization enabled, the compiler can assume that a and b still have their set values when a/b runs. It can then see that the program has undefined behaviour, and thus can do whatever it wants. gcc chooses to produce INT_MIN, like it would from -INT_MIN.

On a 2's complement system, the most-negative number is its own negative. This is a nasty corner case for 2's complement, because it means abs(x) can still be negative. https://en.wikipedia.org/wiki/Two%27s_complement#Most_negative_number

int x86_fault() {
    int a = 0x80000000;
    int b = -1;
    int c = a / b;
    return c;
}

compiles to this with gcc6.3 -O3 for x86-64:

x86_fault:
    mov     eax, -2147483648
    ret

but clang5.0 -O3 compiles to this (with no warning, even with -Wall -Wextra):

x86_fault:
    ret

Undefined behaviour really is totally undefined. Compilers can do whatever they feel like, including returning whatever garbage was in eax on function entry, or storing through a NULL pointer and running an illegal instruction. E.g. with gcc6.3 -O3 for x86-64:

int *local_address(int a) {
    return &a;
}

local_address:
    xor     eax, eax     # return 0
    ret

void foo() {
    int *p = local_address(4);
    *p = 2;
}

foo:
    mov     DWORD PTR ds:0, 0     # store immediate 0 into absolute address 0
    ud2                           # illegal instruction

Your case with -O0 didn't let the compilers see the UB at compile time, so you got the "expected" asm output.

See also What Every C Programmer Should Know About Undefined Behavior (the same LLVM blog post that Basile linked).

