Are `char16_t` and `char32_t` misnomers?

Question

NB: I'm sure someone will call this subjective, but I reckon it's fairly tangible.

C++11 gives us new basic_string types std::u16string and std::u32string, type aliases for std::basic_string<char16_t> and std::basic_string<char32_t>, respectively.
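The alias relationship can be checked at compile time. This is only a minimal sketch, assuming a C++11 or later compiler; it restates the sentence above and nothing more.

```cpp
#include <string>
#include <type_traits>

// std::u16string and std::u32string are plain aliases, not new class types.
static_assert(std::is_same<std::u16string, std::basic_string<char16_t>>::value,
              "u16string should be basic_string<char16_t>");
static_assert(std::is_same<std::u32string, std::basic_string<char32_t>>::value,
              "u32string should be basic_string<char32_t>");

// The element types are therefore the two character types under discussion.
static_assert(std::is_same<std::u16string::value_type, char16_t>::value,
              "u16string stores char16_t elements");
static_assert(std::is_same<std::u32string::value_type, char32_t>::value,
              "u32string stores char32_t elements");
```

If any of these assertions failed, the translation unit would not compile, so a successful build is the whole result.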

To me, the substrings "u16" and "u32" in this context rather imply "UTF-16" and "UTF-32", which would be silly since C++ of course has no concept of text encodings.

The names in fact reflect the character types char16_t and char32_t, but these seem misnamed. They are unsigned, due to the unsignedness of their underlying types:

[C++11: 3.9.1/5]: [..] Types char16_t and char32_t denote distinct types with the same size, signedness, and alignment as uint_least16_t and uint_least32_t, respectively [..]
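The properties in that quotation can be restated as compile-time checks. This is a sketch assuming a conforming C++11 compiler; the helper `wraps_like_unsigned` is a hypothetical name introduced only for illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Same size and alignment as the corresponding least-width types.
static_assert(sizeof(char16_t) == sizeof(std::uint_least16_t), "size of char16_t");
static_assert(sizeof(char32_t) == sizeof(std::uint_least32_t), "size of char32_t");
static_assert(alignof(char16_t) == alignof(std::uint_least16_t), "alignment of char16_t");
static_assert(alignof(char32_t) == alignof(std::uint_least32_t), "alignment of char32_t");

// Same signedness: both are unsigned.
static_assert(std::is_unsigned<char16_t>::value, "char16_t is unsigned");
static_assert(std::is_unsigned<char32_t>::value, "char32_t is unsigned");

// Yet they are distinct types, not typedefs of the least-width types.
static_assert(!std::is_same<char16_t, std::uint_least16_t>::value, "char16_t is distinct");
static_assert(!std::is_same<char32_t, std::uint_least32_t>::value, "char32_t is distinct");

// Hypothetical helper: converting -1 yields the largest value, as for any unsigned type.
constexpr bool wraps_like_unsigned() {
    return static_cast<char16_t>(-1) > 0;
}
```

Note that nothing here pins the width to exactly 16 or 32 bits; the checks only tie the character types to the "least" typedefs, which is the point the question goes on to raise.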

But then it seems to me that these names violate the convention that such unsigned types have names beginning with 'u', and that numbers like 16, unqualified by terms like least, indicate fixed-width types.

My question, then, is this: am I imagining things, or are these names fundamentally flawed?

Recommended answer

The naming convention to which you refer (uint32_t, int_fast32_t, etc.) is actually only used for typedefs, and not for primitive types. The primitive integer types are {signed, unsigned} {char, short, int, long, long long} (as opposed to float or decimal types).

However, in addition to those integer types, there are four distinct, unique, fundamental types, char, wchar_t, char16_t and char32_t, which are the types of the respective literals '', L'', u'' and U'' and are used for alphanumeric data, and similarly for arrays of those. Those types are of course also integer types, and thus they will have the same layout as some of the arithmetic integer types, but the language makes a very clear distinction between the former, arithmetic types (which you would use for computations) and the latter "character" types, which form the basic unit of some type of I/O data.
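A small sketch of that distinction, assuming a C++11 compiler: the literal types can be inspected with decltype, and because char16_t is a distinct type it can take part in overload resolution separately from uint_least16_t. The names `Kind` and `classify` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <type_traits>

// Each literal prefix yields its own fundamental character type.
static_assert(std::is_same<decltype('a'), char>::value, "plain literal is char");
static_assert(std::is_same<decltype(L'a'), wchar_t>::value, "L literal is wchar_t");
static_assert(std::is_same<decltype(u'a'), char16_t>::value, "u literal is char16_t");
static_assert(std::is_same<decltype(U'a'), char32_t>::value, "U literal is char32_t");

// Hypothetical tag used to report which overload was chosen.
enum class Kind { Character, Number };

// These two overloads can coexist only because the parameter types differ;
// if char16_t were a typedef, the second would be a redefinition.
constexpr Kind classify(char16_t) { return Kind::Character; }
constexpr Kind classify(std::uint_least16_t) { return Kind::Number; }
```

With these definitions, `classify(u'A')` selects the character overload and `classify(std::uint_least16_t{65})` selects the arithmetic one, even though both arguments have the same size and representation.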

(I've previously rambled about those new types here and here.)

So, I think that char16_t and char32_t are actually very aptly named to reflect the fact that they belong to the "char" family of integer types.
