Why is type punning considered UB?


Question

Imagine this:

uint64_t x = *(uint64_t *)((unsigned char[8]){'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'});

I have read that type puns like that are undefined behavior. Why? I am literally reinterpreting 8 bytes as an 8-byte integer. I don't see how that's different from a union, except that the type pun is undefined behavior and the union is not. I asked a fellow programmer in person and they said that if you're doing it, either you know what you're doing very well, or you're making a mistake. But the community says that this practice should ALWAYS be avoided? Why?

Recommended answer

Ultimately the why is "because the language specification says so". You don't get to argue with that. If that's the way the language is, it's the way it is.

If you want to know the motivation for making it that way, it's that the original C language lacked any way of expressing that two lvalues can't alias one another (and the modern language's restrict keyword is still barely understood by most users of the language). Being unable to assume two lvalues can't alias means the compiler can't reorder loads and stores, and must actually perform loads and stores from/to memory for every access to an object, rather than keeping values in registers, unless it knows the object's address has never been taken.

C's type-based aliasing rules somewhat mitigate this situation, by letting the compiler assume lvalues with different types don't alias.
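As a rough illustration of what that assumption buys the compiler, consider the sketch below. The function name `store_both` and its parameters are hypothetical and not from the question. Because `*count` is an `int` and `*scale` is a `float`, the compiler may assume the store through `scale` cannot change `*count`, so it is free to return the constant 1 without reloading from memory.

```c
/* Hypothetical example: two lvalues of different types. */
int store_both(int *count, float *scale)
{
    *count = 1;
    *scale = 2.0f;  /* under type-based aliasing, assumed not to touch *count */
    return *count;  /* the compiler may fold this to "return 1" */
}
```

If a caller forced both pointers to refer to the same object with a cast, that assumption would be violated, which is why such an access is left undefined rather than given a fixed meaning.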

Note also that in your example, there's not only type-punning but misalignment. The unsigned char array has no inherent alignment, so accessing a uint64_t at that address would be an alignment error (UB for another reason) independent of any aliasing rules.
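One common way to sidestep both problems is to copy the bytes into a real `uint64_t` object with `memcpy` instead of dereferencing a cast pointer. This is a minimal sketch, not something the answer itself proposes; the helper name `load_u64` is an assumption made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 8 bytes as a uint64_t without casting the pointer.
   memcpy accesses the source as bytes, so there is no aliasing violation,
   and the source buffer needs no particular alignment. */
uint64_t load_u64(const unsigned char *bytes)
{
    uint64_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

The resulting integer value still depends on the machine's byte order, just as it would with the cast. Compilers typically turn a fixed-size `memcpy` like this into a single load where the target allows it.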
