Is undefined behavior only an issue if you are deploying on several platforms?


Question


    Most conversations about undefined behavior (UB) focus on how some platform can do this, or some compiler does that.

    What if you are only interested in one platform and only one compiler (same version) and you know you will be using them for years?

    Nothing is changing but the code, and the UB is not implementation-defined.

    Once the UB has manifested for that architecture and that compiler and you have tested, can't you assume that from then on whatever the compiler did with the UB the first time, it will do that every time?

    Note: I know undefined behavior is very, very bad, but when I pointed out UB in code written by somebody in this situation, they asked this, and I didn't have anything better to say than, if you ever have to upgrade or port, all the UB will be very expensive to fix.

    It seems there are different categories of behavior:

    1. Defined - behavior that the standard documents as working
    2. Supported - behavior that the implementation documents as supported, a.k.a. implementation-defined
    3. Extensions - documented additions; support for low-level bit operations like popcount and branch hints falls into this category
    4. Constant - while not documented, these behaviors will likely be consistent on a given platform: things like endianness and sizeof int are not portable but are unlikely to change
    5. Reasonable - generally safe and usually legacy: casting from unsigned to signed, using the low bit of a pointer as temp space
    6. Dangerous - reading uninitialized or unallocated memory, returning a temp variable, using memcpy on a non-POD class

    It would seem that Constant might be invariant within a patch version on one platform. The line between Reasonable and Dangerous seems to be shifting, with more and more behavior moving towards Dangerous as compilers become more aggressive in their optimizations.
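As a hedged illustration of that last point (not part of the original question), consider an overflow check. The commented-out form relies on signed overflow, which is UB, so an optimizer is allowed to assume it never happens. The names `increment_would_overflow` and `checked_increment` are made up for this sketch.

```cpp
#include <cassert>
#include <limits>

// UB-reliant form, shown only as a comment:
//     bool increment_would_overflow(int x) { return x + 1 < x; }
// Signed overflow is undefined, so a compiler may assume x + 1 never wraps
// and reduce the comparison to `false`.

// Well-defined form: compare against the limit before doing any arithmetic.
bool increment_would_overflow(int x) {
    return x == std::numeric_limits<int>::max();
}

// Usage: guard the addition instead of testing for wraparound afterwards.
int checked_increment(int x) {
    assert(!increment_would_overflow(x) && "increment would overflow int");
    return x + 1;
}
```

The point of the sketch is only that the check happens before the arithmetic, so no undefined operation is ever evaluated.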

Answer

    OS changes, innocuous system changes (different hardware version!), or compiler changes can all cause previously "working" UB to not work.

    But it is worse than that.

    Sometimes a change to an unrelated compilation unit, or to far-away code in the same compilation unit, can cause previously "working" UB to not work. As an example, take two inline functions or methods with different definitions but the same signature (a violation of the one-definition rule). One is silently discarded during linking, and completely innocuous code changes can change which one is discarded.

    Code that works in one context can suddenly stop working with the same compiler, OS and hardware when you use it in a different context. An example of this is violating strict aliasing; the compiled code might work when called at spot A, but when inlined (possibly at link-time!) the code can change meaning.
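A minimal sketch of the aliasing case, assuming a platform where `float` is a 32-bit IEEE-754 value (the `static_assert`s check this). The function name `float_bits` is hypothetical. The commented-out cast is the kind of code whose meaning can change with inlining; the `std::memcpy` form is well defined and compilers typically reduce it to a plain register move.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE-754 floats");

// UB-reliant form, shown only as a comment:
//     std::uint32_t float_bits(float f) {
//         return *reinterpret_cast<std::uint32_t*>(&f);  // violates strict aliasing
//     }

// Well-defined form: copy the object representation byte by byte.
std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// Usage: 1.0f is encoded as 0x3F800000 in IEEE-754 binary32.
void demo_float_bits() {
    assert(float_bits(1.0f) == 0x3F800000u);
}
```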

    Your code, if part of a larger project, could conditionally call some third-party code (say, a shell extension that previews an image type in a file open dialog) that changes the state of some flags (floating point precision, locale, integer overflow flags, division by zero behavior, etc.). Your code, which worked fine before, now exhibits completely different behavior.
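Changing the locale is not UB in itself, but it is the easiest of these hidden global flags to show. As a sketch only: the hypothetical helper `parse_decimal` below pins the stream to the classic "C" locale, so it does not care whether some other component has changed the global locale behind your back.

```cpp
#include <cassert>
#include <locale>
#include <sstream>
#include <string>

// Parses a decimal number with '.' as the decimal separator, regardless of
// what the process-wide locale happens to be at the time of the call.
double parse_decimal(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());  // ignore the global locale for this stream
    double value = 0.0;
    in >> value;
    assert(!in.fail() && "input was not a number");
    return value;
}
```

Code that depends on such state implicitly has no equivalent of this pin, which is why a third-party call can silently change what it does.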

    Next, many kinds of undefined behavior are inherently non-deterministic. Accessing the contents of a pointer after it is freed (even writing to it) might be safe 99 times out of 100, but the other time the page was swapped out, or something else was written there before you got to it. Now you have memory corruption. It passes all your tests, but you lacked complete knowledge of what can go wrong.
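For contrast, here is one well-defined way to ask "is this object still alive?" instead of dereferencing a possibly freed pointer. This is a sketch assuming shared ownership is acceptable in your design; `read_or` is a made-up name.

```cpp
#include <cassert>
#include <memory>

// Returns the observed value while the object is alive, or `fallback` once
// the last owner has released it. No freed memory is ever touched.
int read_or(const std::weak_ptr<int>& observer, int fallback) {
    if (std::shared_ptr<int> owner = observer.lock()) {
        return *owner;
    }
    return fallback;
}

// Usage: the result is deterministic both before and after the object dies.
void demo_read_or() {
    std::shared_ptr<int> owner = std::make_shared<int>(42);
    std::weak_ptr<int> observer = owner;
    assert(read_or(observer, -1) == 42);
    owner.reset();  // destroys the int
    assert(read_or(observer, -1) == -1);
}
```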

    By using undefined behavior, you commit yourself to a complete understanding of the C++ standard, everything your compiler can do in that situation, and every way the runtime environment can react. You have to audit the produced assembly, not the C++ source, possibly for the entire program, every time you build it! You also commit everyone who reads that code, or who modifies that code, to that level of knowledge.

    It is sometimes still worth it.

    Fastest Possible Delegates uses UB and knowledge about calling conventions to be a really fast non-owning std::function-like type.

    Impossibly Fast Delegates competes. It is faster in some situations, slower in others, and is compliant with the C++ standard.

    Using the UB might be worth it, for the performance boost. It is rare that you gain something other than performance (speed or memory usage) from such UB hackery.

    Another example I've seen is when we had to register a callback with a poor C API that just took a function pointer. We'd create a function (compiled without optimization), copy it to another page, modify a pointer constant within that function, then mark that page as executable, allowing us to secretly pass a pointer along with the function pointer to the callback.

    An alternative implementation would be to have some fixed size set of functions (10? 100? 1000? 1 million?) all of which look up a std::function in a global array and invoke it. This would put a limit on how many such callbacks we install at any one time, but practically was sufficient.
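A rough sketch of that alternative, under stated assumptions: the C API is taken to accept a `void (*)(int)` (the real signature is not given above), the slot count of 8 is arbitrary, and `install`, `uninstall`, `trampoline`, and the globals are names invented here. This is not the code we actually used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

using CCallback = void (*)(int);       // hypothetical signature the C API accepts
constexpr std::size_t kSlotCount = 8;  // arbitrary cap on live callbacks

// One std::function per slot; an empty slot is free.
std::array<std::function<void(int)>, kSlotCount> g_slots;

// One distinct plain function per slot; each forwards to its own slot.
template <std::size_t Index>
void trampoline(int value) {
    g_slots[Index](value);
}

template <std::size_t... Indices>
std::array<CCallback, sizeof...(Indices)> make_trampolines(std::index_sequence<Indices...>) {
    return {{&trampoline<Indices>...}};
}

const std::array<CCallback, kSlotCount> g_trampolines =
    make_trampolines(std::make_index_sequence<kSlotCount>{});

// Binds `handler` to a free slot and returns the matching function pointer,
// or nullptr when every slot is in use.
CCallback install(std::function<void(int)> handler) {
    assert(handler && "install() expects a callable handler");
    for (std::size_t i = 0; i < kSlotCount; ++i) {
        if (!g_slots[i]) {
            g_slots[i] = std::move(handler);
            return g_trampolines[i];
        }
    }
    return nullptr;
}

// Frees the slot that belongs to `callback`, if any.
void uninstall(CCallback callback) {
    for (std::size_t i = 0; i < kSlotCount; ++i) {
        if (g_trampolines[i] == callback) {
            g_slots[i] = nullptr;
        }
    }
}
```

The trade-off is the one described above: a hard limit on simultaneously installed callbacks in exchange for staying inside the standard. The sketch is not thread-safe.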
