Why compilers no longer optimize this UB with strict aliasing


Question

    One of the first results for strict aliasing on Google is this article: http://dbp-consulting.com/tutorials/StrictAliasing.html
    One interesting thing I noticed is this: http://goo.gl/lPtIa5

    uint32_t swaphalves(uint32_t a) {
      uint32_t acopy = a;
      uint16_t* ptr = (uint16_t*)&acopy;
      uint16_t tmp = ptr[0];
      ptr[0] = ptr[1];
      ptr[1] = tmp;
      return acopy;
    }
    

    is compiled to

    swaphalves(unsigned int):
            mov     eax, edi
            ret
    

    by GCC 4.4.7. Any compiler newer than that (4.4 is mentioned in the article, so the article is not wrong) does not implement the function as it could using strict aliasing. What is the reason for this? Was it in fact a bug in GCC, did GCC decide to drop the optimization because so much code is written in a way that produces UB, or is it just a compiler regression that has lasted for years? Also, Clang does not optimize it.

    Solution

    The GCC developers put some effort into making the compiler behave "as expected" in these cases. (I wish I could give you a proper reference for this - I remember it coming up on a mailing list or somesuch at some time).

    At any rate, something you say:

    ... does not implement the function as it could using strict aliasing

    ... implies perhaps a slight misunderstanding of what the strict aliasing rules are for. Your code sample invokes undefined behavior - so any compilation is technically valid, including just a plain ret or generation of a trap instruction, or even nothing at all (it's legitimate to assume the function can never be called). That newer versions of GCC produce longer/slower code is hardly a deficiency, since producing code that does any particular thing at all would not violate the standard. In fact, the newer versions improve the situation by producing code that does what the programmer probably intended the code to do instead of silently doing something different.

    What would you rather - that the compiler produces fast code that doesn't do what you want, or slightly slower code that does do what you want?

    That being said, I firmly believe that you should not write code that breaks the strict aliasing rules. Relying on the compiler doing the "right" thing when it is "obvious" what is intended is walking a tightrope. Optimisation is hard enough already, without the compiler having to guess at - and make allowances for - what the programmer intended. Further, it's possible to write code which obeys the rules and which can be turned into very efficient object code by the compiler. Indeed the further question can be raised:

    Why did earlier versions of GCC behave the way they did, and "optimize" the function by relying on adherence to the strict aliasing rules?

    That's a little bit complicated, but is interesting for this discussion (especially in light of suggestions that the compiler is going to some lengths just to break code). Strict aliasing is a component of (or rather, a rule which assists) a process called alias analysis. This process decides whether two pointers alias or not. There are, essentially, 3 possible conditions between any two pointers:

    • They MUST NOT ALIAS (the strict aliasing rule makes it easy to deduce this condition, though it can sometimes be deduced in other ways).
    • They MUST ALIAS (this requires analysis; value propagation might detect this condition, for instance).
    • They MAY ALIAS. This is the default condition when neither of the other two conditions can be established.
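The three conditions in the list above can be sketched with three small C functions. This is only an illustration: the function names are hypothetical and do not come from the question or from GCC, and whether a particular compiler actually applies each deduction depends on its version and flags.

```c
#include <stdint.h>

/* MUST NOT ALIAS: under the type-based rule, a uint32_t lvalue and a
 * uint16_t lvalue cannot designate the same object, so a compiler may
 * assume the store through h leaves *w untouched and return 1 directly.
 * Callers must pass pointers to distinct, correctly typed objects. */
uint32_t must_not_alias(uint32_t *w, uint16_t *h) {
    *w = 1;
    *h = 2;
    return *w;
}

/* MUST ALIAS: q is visibly a copy of p, so value propagation can prove
 * that both name the same object; the result is always 2. */
uint32_t must_alias(uint32_t *p) {
    uint32_t *q = p;
    *p = 1;
    *q = 2;
    return *p;
}

/* MAY ALIAS: same type, unknown origin. Nothing can be proven either
 * way, so *p has to be reloaded after the store through q. */
uint32_t may_alias(uint32_t *p, uint32_t *q) {
    *p = 1;
    *q = 2;
    return *p;
}
```

Calling `may_alias(&x, &y)` with two distinct objects yields 1, while `may_alias(&x, &x)` yields 2, which is why the compiler cannot fold that return value to a constant.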

    In the case of the code in your question, strict aliasing implies a MUST NOT ALIAS condition between &acopy and ptr (it is trivial to make this determination, because the two values have incompatible types which are not allowed to alias). This condition allows for the optimisation that you then see: all the manipulation of *ptr values can be discarded because they cannot in theory affect the value of acopy and they do not otherwise escape the function (which can be determined via escape analysis).
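To make that reasoning concrete, here is roughly what is left of the function once the accesses through ptr are treated as unable to touch acopy and are removed as dead. The function name is hypothetical, and this is a hand-written sketch of the effect, not actual GCC output; it does match the `mov eax, edi; ret` listing from the question in that it returns its argument unchanged.

```c
#include <stdint.h>

/* Hypothetical name. Sketch of swaphalves after the compiler assumes
 * that ptr[0] and ptr[1] cannot alias acopy: the loads and stores
 * through ptr have no observable effect and are deleted. */
uint32_t swaphalves_after_dead_store_removal(uint32_t a) {
    uint32_t acopy = a;
    /* uint16_t tmp = ptr[0]; ptr[0] = ptr[1]; ptr[1] = tmp;  -- removed */
    return acopy;
}
```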

    It takes further effort to determine the MUST ALIAS condition between the two pointers. Furthermore, in doing so the compiler would need to ignore (at least temporarily) the previously ascertained MUST NOT ALIAS condition, which means it must spend time attempting to ascertain the truth of a condition which, if everything is as it should be, must be false.

    When both MUST NOT ALIAS and MUST ALIAS conditions are determined, we have a case where the code must be invoking undefined behaviour (and we can issue a warning). We then have to decide which condition to keep and which to discard. Because MUST NOT ALIAS, in this case, comes from a constraint which can be (and indeed has been) broken by the user, it is the best option to discard.

    So, the older versions of GCC either do not do the requisite analysis to determine the MUST ALIAS condition (perhaps because the opposite MUST NOT ALIAS condition has already been established), or alternatively, the older GCC version opts to discard the MUST ALIAS condition in preference to the MUST NOT ALIAS condition, which leads to faster code which does not do what the programmer most likely intended. In either case, it seems that the newer versions offer an improvement.
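Regarding the earlier remark that code obeying the rules can still compile to efficient object code, here are two possible rewrites of swaphalves that avoid the aliasing violation. Both function names are made up for this sketch. The claim that optimizing compilers commonly reduce either form to a single rotate instruction is an expectation, not a guarantee, and is worth checking against your own compiler's output.

```c
#include <stdint.h>
#include <string.h>

/* Variant 1: plain arithmetic. No pointers are involved, so no
 * aliasing question arises at all. */
uint32_t swaphalves_shift(uint32_t a) {
    return (uint32_t)((a << 16) | (a >> 16));
}

/* Variant 2: copy the object representation with memcpy, which is
 * allowed to read and write the bytes of any object. */
uint32_t swaphalves_memcpy(uint32_t a) {
    uint16_t halves[2];
    memcpy(halves, &a, sizeof halves);   /* split into two 16-bit halves */
    uint16_t tmp = halves[0];
    halves[0] = halves[1];
    halves[1] = tmp;
    memcpy(&a, halves, sizeof a);        /* reassemble in swapped order */
    return a;
}
```

Both variants map 0x12345678 to 0x56781234, independent of byte order, because swapping the two halves in memory swaps the high and low halves of the value on little-endian and big-endian machines alike.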
