Is integer overflow undefined in inline x86 assembly?


Question

Say I have the following C code:

int32_t foo(int32_t x) {
    return x + 1;
}

This is undefined behavior when x == INT_MAX. Now say I performed the addition with inline assembly instead:

int32_t foo(int32_t x) {
    asm("incl %0" : "+g"(x));
    return x;
}

Question: Does the inline assembly version still invoke undefined behavior when x == INT_MAX? Or does undefined behavior only apply to the C code?

Answer

No, there's no UB with this. C rules don't apply to the asm instructions themselves. As far as the inline-asm syntax wrapping the instructions, that's a well-defined language extension that has defined behaviour on implementations that support it.

See "Does undefined behavior apply to asm code?" for a more generic version of this question (vs. this one about x86 assembly and the GNU C inline asm language extension). The answers there focus on the C side of things, with quotes from the C and C++ standards that document how little the standard has to say about implementation-defined extensions to the language.

See also this comp.lang.c thread for arguments about whether it makes sense to say it has UB "in general" because not all implementations have that extension.


BTW, if you just want signed wraparound with defined 2's complement behaviour in GNU C, compile with -fwrapv. Don't use inline asm. (Or use an __attribute__ to enable that option for just the function that needs it.) -fwrapv is not quite the same thing as -fno-strict-overflow, which merely disables optimizations based on assuming the program doesn't have any UB; for example, overflow in compile-time-constant calculations is only safe with -fwrapv.
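When the goal is to detect the overflow rather than to wrap silently, inline asm is not needed either. The following is a minimal sketch, assuming GCC 5+ or Clang, which provide `__builtin_add_overflow`. The helper name `checked_inc` is hypothetical and not part of the original answer.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: computes x + 1 without invoking signed-overflow UB.
   Returns true if the mathematical result did not fit in int32_t.
   *out always receives the result truncated to 32 bits (wrapped). */
static bool checked_inc(int32_t x, int32_t *out)
{
    return __builtin_add_overflow(x, (int32_t)1, out);
}
```

Unlike the inline-asm version, the compiler can still constant-fold and optimize through this builtin.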


Inline-asm behaviour is implementation defined, and GNU C inline asm is defined as a black box for the compiler. Inputs go in, outputs come out, and the compiler doesn't know how. All it knows is what you tell it using the out/in/clobber constraints.


Your foo that uses inline-asm behaves identically to

int32_t foo(int32_t x) {
    uint32_t u = x;
    return ++u;
}

on x86, because x86 is a 2's complement machine, so integer wraparound is well-defined. The difference is performance: the asm version defeats constant propagation and also prevents the compiler from optimizing x - inc(x) to -1, and so on. See https://gcc.gnu.org/wiki/DontUseInlineAsm: avoid inline asm unless there's no way to coax the compiler into generating optimal asm by tweaking the C.
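The claimed equivalence can be checked directly. This is a sketch only: the names `foo_asm`, `foo_unsigned` and `same_result` are made up for illustration, the asm path is compiled only for GNU-compatible compilers targeting x86, and other targets fall back to the unsigned version so the file still builds. The conversion back to `int32_t` is implementation-defined in ISO C; GCC documents it as modulo 2^32 reduction.

```c
#include <stdint.h>

/* Pure C version: the increment happens on uint32_t, where wraparound is defined. */
static int32_t foo_unsigned(int32_t x)
{
    uint32_t u = (uint32_t)x;
    ++u;
    return (int32_t)u;  /* implementation-defined conversion; wraps on GCC/Clang */
}

/* Inline-asm version from the question, using a register constraint. */
static int32_t foo_asm(int32_t x)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__("incl %0" : "+r"(x));   /* implicit "cc" clobber on x86 */
    return x;
#else
    return foo_unsigned(x);         /* assumption: non-x86 fallback for portability */
#endif
}

/* Returns nonzero when both versions agree for the given input. */
static int same_result(int32_t x)
{
    return foo_asm(x) == foo_unsigned(x);
}
```

`__asm__` is used instead of `asm` so the snippet also compiles with strict `-std=c99` or `-std=c11`.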

It doesn't raise exceptions. Setting the OF flag has no impact on anything, because GNU C inline asm for x86 (i386 and amd64) has an implicit "cc" clobber, so the compiler will assume that the condition codes in EFLAGS hold garbage after every inline-asm statement. gcc6 introduced a new syntax for asm to produce flag results (which can save a SETCC in your asm and a TEST generated by the compiler for asm blocks that want to return a flag condition).
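The flag-output syntax mentioned above looks roughly like this. It is a sketch, assuming GCC 6+ (or a compatible Clang) on x86, detected via the predefined macro `__GCC_ASM_FLAG_OUTPUTS__`. The function name `inc_report_overflow` is hypothetical, and the `#else` branch is a fallback added here so the example compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: increments x and reports whether OF was set. */
static int32_t inc_report_overflow(int32_t x, int *overflowed)
{
#if defined(__GCC_ASM_FLAG_OUTPUTS__) && (defined(__i386__) || defined(__x86_64__))
    int of;
    /* "=@cco" asks the compiler to read the overflow condition (OF)
       straight from EFLAGS after the asm statement, with no SETO in the asm. */
    __asm__("incl %0" : "+r"(x), "=@cco"(of));
    *overflowed = of;
    return x;
#else
    /* Fallback (assumption): same observable result via a compiler builtin. */
    int32_t r;
    *overflowed = __builtin_add_overflow(x, (int32_t)1, &r);
    return r;
#endif
}
```

Without a flag output operand, the OF value produced by `incl` is simply discarded, which is why the original `foo` cannot observe it.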

Some architectures do raise exceptions (traps) on integer overflow, but x86 is not one of them (except when a division quotient doesn't fit in the destination register). On MIPS, you'd use ADDIU instead of ADDI on signed integers if you wanted them to be able to wrap without trapping. (Because it's also a 2's complement ISA, so signed wraparound is the same in binary as unsigned wraparound.)
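The division exception is the one case where x86 integer arithmetic does trap: IDIV raises #DE both for a zero divisor and for INT32_MIN / -1, whose quotient does not fit in 32 bits. Both cases are also UB in C, so a guard is needed either way. A minimal sketch, with the hypothetical helper name `checked_div32`:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: divides only when the quotient is representable.
   Returns false (and leaves *quot untouched) for the two inputs that
   would make x86 IDIV fault. */
static bool checked_div32(int32_t num, int32_t den, int32_t *quot)
{
    if (den == 0)
        return false;                    /* divide by zero */
    if (num == INT32_MIN && den == -1)
        return false;                    /* quotient 2^31 does not fit in int32_t */
    *quot = num / den;
    return true;
}
```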


Undefined (or at least implementation-dependent) Behaviour in x86 asm:

BSF and BSR (find first set bit forward or reverse) leave their destination register with undefined contents if the input was zero. (TZCNT and LZCNT don't have that problem). Intel's recent x86 CPUs do define the behaviour, which is to leave the destination unmodified, but the x86 manuals don't guarantee that. See the section on TZCNT in this answer for more discussion on the implications, e.g. that TZCNT/LZCNT/POPCNT have a false dependency on the output in Intel CPUs.
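The same zero-input hazard shows up at the C level: GCC and Clang document the result of `__builtin_ctz(0)` as undefined, which lets them emit a bare BSF. A small sketch of a wrapper that defines the zero case explicitly follows. The name `ctz32_safe` and the choice of returning 32 for zero (matching what TZCNT produces for a 32-bit operand) are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: count trailing zero bits, with a defined result for 0. */
static unsigned ctz32_safe(uint32_t v)
{
    if (v == 0)
        return 32u;                     /* chosen to match TZCNT's result for zero */
    return (unsigned)__builtin_ctz(v);  /* well-defined because v != 0 */
}
```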

Several other instructions leave some flags undefined in some or all cases (especially AF and PF). IMUL, for example, leaves ZF, PF, and AF undefined.

Presumably any given CPU has consistent behaviour, but the point is that other CPUs might behave differently even though they're still x86. If you're Microsoft, Intel will design their future CPUs to not break your existing code. If your code is that widely-relied-on, you'd better stick to only relying on behaviour documented in the manuals, not just what your CPU happens to do. See Andy Glew's answer and comments here. Andy was one of the architects of Intel's P6 microarchitecture.

These examples are not the same thing as UB in C. They're more like what C would call "implementation defined", since we're just talking about one value that's unspecified, not the possibility of nasal demons (or, more plausibly, modifying other registers or jumping somewhere).

For really undefined behaviour, you probably need to look at privileged instructions, or at least multi-threaded code. Self-modifying code is also potentially UB on x86: it's not guaranteed that the CPU "notices" stores to addresses that are about to be executed until after a jump instruction. This was the subject of the question linked above (and the answer is: real implementations of x86 go above and beyond what the x86 ISA manual requires, to support code that depends on it, and because snooping all the time is better for high-performance than flushing on jumps.)

Undefined behaviour in assembly language is pretty rare, especially if you don't count cases where a specific value is unspecified but the scope of the "damage" is predictable and limited.
