Why is casting from char to std::byte potentially undefined behavior?

Question

C++17 requires std::byte to be an enum class:

enum class byte : unsigned char {};

We may want to use std::byte rather than char to represent raw memory, since it is more type-safe, has its byte-specific operators defined, and can't promote to int out of the blue like char does. We need explicit casts or to_integer to convert std::byte to other integers. However, from a lot of sources we still get char (or, more likely, whole buffers of char), and so may want to convert it:

#include <cstddef>

void fn(char c)
{
    std::byte b = static_cast<std::byte>(c);
    // ... this may invoke undefined behavior, see below
}

The signedness of char is implementation-defined, so std::numeric_limits<char>::is_signed may be true. Therefore c above may have a negative value that is outside the range of unsigned char.
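To make the range problem concrete, here is a small sketch. The names in_unsigned_char_range and check_char_range are made up for illustration and are not part of any library. It reports whether a char value already lies within the range of unsigned char; on an implementation where char is signed, a byte with the high bit set does not.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true if the value of c is non-negative, which is
// exactly when it lies within the range of unsigned char (std::byte's
// underlying type).
constexpr bool in_unsigned_char_range(char c)
{
    return static_cast<int>(c) >= 0;
}

// Hypothetical self-check that holds whether char is signed or unsigned.
inline void check_char_range()
{
    const char high = '\xFF'; // a byte with the high bit set
    if (std::numeric_limits<char>::is_signed) {
        // Typically -1 here, so outside the range of unsigned char.
        assert(!in_unsigned_char_range(high));
    } else {
        // 255 when char is unsigned, so already in range.
        assert(in_unsigned_char_range(high));
    }
}
```

Plain ASCII characters such as 'A' are in range either way; only bytes with the high bit set are affected.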

Now, in the C++17 standard, 8.2.9 Static cast [expr.static.cast] paragraph 10 says:

A value of integral or enumeration type can be explicitly converted to a complete enumeration type. The value is unchanged if the original value is within the range of the enumeration values (10.2). Otherwise, the behavior is undefined.

And from 10.2 we can see that the range mentioned is the range of the underlying type. Therefore, to avoid undefined behavior, we have to write more code. For example, we can add a cast to unsigned char to get the well-defined modular-arithmetic conversion:

void fn(char c)
{
    std::byte b = static_cast<std::byte>(static_cast<unsigned char>(c));
    // ... now we have done it in a portable manner?
}
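Since the input is more likely a whole buffer of char, the same double cast can be wrapped in a helper. This is only a sketch; to_byte and to_bytes are hypothetical names, not standard functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: converts one char via unsigned char, so the value
// handed to the enum cast is always within the underlying type's range.
inline std::byte to_byte(char c)
{
    return static_cast<std::byte>(static_cast<unsigned char>(c));
}

// Hypothetical helper: copies a char buffer into a vector of std::byte.
inline std::vector<std::byte> to_bytes(const char* data, std::size_t size)
{
    std::vector<std::byte> out;
    out.reserve(size);
    for (std::size_t i = 0; i < size; ++i)
        out.push_back(to_byte(data[i]));
    return out;
}
```

Calling to_bytes(buf, n) gives an owning copy. Whether copying is acceptable depends on the use case.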

Did I misunderstand something? Isn't that overly complicated and restrictive? Why can't an enum class with an unsigned underlying type follow modular arithmetic like its underlying type does? Note that the whole chain of casts is most likely compiled into nothing by the compiler anyway. A signed char has had to be two's complement since C++14, so its bitwise representation has to be the same as after the modular-arithmetic conversion to unsigned char. Who benefits from that formal undefined behavior, and how?
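The claim about identical bit patterns can be checked with a short sketch, assuming a two's complement implementation (the function names are illustrative only). It compares the value conversion to unsigned char with the raw object representation for every char value.

```cpp
#include <climits>
#include <cstring>

// Reads the object representation of c as an unsigned char.
inline unsigned char bits_of(char c)
{
    unsigned char u;
    std::memcpy(&u, &c, 1);
    return u;
}

// Returns true if, for every char value, the value conversion to
// unsigned char yields the same bits as the object representation.
inline bool conversion_preserves_bits()
{
    for (int i = CHAR_MIN; i <= CHAR_MAX; ++i) {
        const char c = static_cast<char>(i);
        if (static_cast<unsigned char>(c) != bits_of(c))
            return false;
    }
    return true;
}
```

If conversion_preserves_bits() returns true, the cast to unsigned char does not change any bits, which is why the chain of casts can compile to nothing.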

Recommended answer

This issue is going to be fixed (in C++2a, see the rationale below). The revised wording reads:

A value of integral or enumeration type can be explicitly converted to a complete enumeration type. If the enumeration type has a fixed underlying type, the value is first converted to that type by integral conversion, if necessary, and then to the enumeration type. If the enumeration type does not have a fixed underlying type, the value is unchanged if the original value is within the range of the enumeration values ([dcl.enum]), and otherwise, the behavior is undefined
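Read against this revised wording, the direct cast to std::byte behaves like the two-step cast from the question, because std::byte has the fixed underlying type unsigned char. A minimal sketch, assuming a compiler that implements the revised rule (the function names are made up for illustration):

```cpp
#include <cstddef>

// Direct cast: under the revised wording the value is first converted to
// unsigned char by integral conversion, then to the enumeration type.
inline std::byte direct(char c)
{
    return static_cast<std::byte>(c);
}

// Explicit two-step cast, as in the workaround from the question.
inline std::byte via_underlying(char c)
{
    return static_cast<std::byte>(static_cast<unsigned char>(c));
}
```

Under the revised rule both functions should return the same std::byte for every input, including bytes with the high bit set.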

Here's the rationale behind the change from (C++11) unspecified to (C++17) undefined: 

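Although issue 1094 clarified that the value of an expression of enumeration type might not be within the range of the values of the enumeration after a conversion to the enumeration type (see 8.2.9 [expr.static.cast] paragraph 10), the result is simply an unspecified value. This should probably be strengthened to produce undefined behavior, in light of the fact that undefined behavior makes an expression non-constant.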
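The point about constant expressions can be illustrated with an enumeration that has no fixed underlying type. This is a sketch with a made-up enum Small; whether the commented-out line is actually diagnosed depends on the compiler and its version.

```cpp
// Unscoped enumeration without a fixed underlying type. Its enumerators
// are 0 and 3, so its range of enumeration values is 0..3.
enum Small { zero = 0, three = 3 };

// In range: the value is unchanged and this is a valid constant expression.
constexpr Small ok = static_cast<Small>(2);

// Out of range: undefined behavior, so not a constant expression.
// Some compilers reject this line when uncommented; others may not.
// constexpr Small bad = static_cast<Small>(7);

static_assert(static_cast<int>(ok) == 2, "in-range value is unchanged");
```

Making the out-of-range case undefined rather than unspecified is what allows a compiler to reject it during constant evaluation.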

And here's the rationale behind the C++2a fix:

The specifications of std::byte (21.2.5 [support.types.byteops]) and bitmask (20.4.2.1.4 [bitmask.types]) have revealed a problem with the integral conversion rules, according to which both those specifications have, in the general case, undefined behavior. The problem is that a conversion to an enumeration type has undefined behavior unless the value to be converted is in the range of the enumeration.

For enumerations with an unsigned fixed underlying type, this requirement is overly restrictive, since converting a large value to an unsigned integer type is well-defined.
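The bitmask case mentioned above can be sketched as follows. Perm is a hypothetical enumeration, and the operators are written in the style of [bitmask.types], not copied from any library. The operand of ~ is promoted to int, so the intermediate value is negative and therefore outside the range of the unsigned underlying type.

```cpp
// Hypothetical bitmask-style enumeration with a fixed unsigned underlying type.
enum class Perm : unsigned char { none = 0, read = 1, write = 2, exec = 4 };

// ~ is applied after promotion to int, so the intermediate result is
// negative, e.g. ~1 == -2. Under the old rule that value is out of range
// for the enum; under the revised rule it is first converted to
// unsigned char (giving 0xFE) and then to Perm.
inline Perm operator~(Perm p)
{
    return static_cast<Perm>(~static_cast<unsigned char>(p));
}

inline Perm operator&(Perm a, Perm b)
{
    return static_cast<Perm>(static_cast<unsigned char>(a) &
                             static_cast<unsigned char>(b));
}
```

With the revised rule, ~Perm::read keeps every bit except the read bit, which is the result a bitmask user would expect.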
