What's the standard-defined endianness of std::wstring?


Question

I know that UTF-16 comes in two byte orders: big endian and little endian.

Does the C++ standard define the endianness of std::wstring, or is it implementation-defined?

If it is standard-defined, which page of the C++ standard provides the rules on this issue?

If it is implementation-defined, how can I determine it, e.g. under VC++? Does the compiler guarantee that the endianness of std::wstring depends strictly on the processor?

I have to know this because I want to send a UTF-16 string to others. I must add the correct BOM at the beginning of the UTF-16 string to indicate its endianness.

In short: given a std::wstring, how can I reliably determine its endianness?

Recommended answer

Endianness is machine dependent, not language dependent. It is defined by the processor and by how it arranges data going into and out of memory. The C++ standard does not fix a byte order for wchar_t, so a std::wstring simply uses the byte order of the platform it was compiled for. When dealing with wchar_t (which is normally wider than a single byte), the processor itself arranges the bytes as it needs to on every read from or write to RAM. Code simply sees the 16-bit (or larger) value as it is represented in a processor register.
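Because code only ever sees the value of each wchar_t, one way to send UTF-16 with a correct BOM is to pick the byte order yourself and write the bytes out with shifts, rather than detect the host order. The following is a minimal sketch of that idea, not part of the original answer. The function name toUtf16LeWithBom is hypothetical, and it assumes each wchar_t holds exactly one UTF-16 code unit, as is the case under VC++.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: serializes a std::wstring to UTF-16LE bytes,
// prefixed with the little-endian BOM (FF FE).
// Assumption: every wchar_t holds one UTF-16 code unit, as on Windows/VC++.
// Where wchar_t is 32 bits wide, values above 0xFFFF would first have to be
// split into surrogate pairs, which this sketch does not attempt.
std::vector<std::uint8_t> toUtf16LeWithBom(const std::wstring& text)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(2 + text.size() * 2);

    // U+FEFF written low byte first marks the stream as little endian.
    bytes.push_back(0xFF);
    bytes.push_back(0xFE);

    for (wchar_t ch : text) {
        const std::uint16_t unit = static_cast<std::uint16_t>(ch);
        // Shifts and masks operate on the value, not on its memory layout,
        // so the emitted order is the same on any host.
        bytes.push_back(static_cast<std::uint8_t>(unit & 0xFF));         // low byte
        bytes.push_back(static_cast<std::uint8_t>((unit >> 8) & 0xFF));  // high byte
    }
    return bytes;
}
```

With this approach the receiver always gets little-endian data with a matching BOM, whatever processor produced it. Writing big-endian output would only require swapping the two push_back calls in each pair.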

To determine the endianness yourself (if that is really what you want to do), you could write a known 32-bit (unsigned int) value out to RAM and then read it back through a char pointer. Check the order in which the bytes come back.

It would look something like this:

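#include <stdio.h>
int main(void)
{
    unsigned int aVal = 0x11223344;  /* known 32-bit pattern */
    unsigned char *myValReadBack = (unsigned char *)&aVal;  /* byte at the lowest address */
    /* Big endian stores the most significant byte (0x11) first. */
    if (*myValReadBack == 0x11) printf("Big endian\n");
    else                        printf("Little endian\n");
    return 0;
}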

I'm sure there are other ways, but something like the above should work. Do double-check my little versus big, though :-)
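As one such other way, the same probe can be applied to wchar_t directly, since that is the type std::wstring actually stores. This is only a sketch under that assumption. The name wcharIsLittleEndian is hypothetical, and std::memcpy is used to inspect the bytes instead of a pointer cast.

```cpp
#include <cstring>

// Hypothetical helper: reports whether this platform stores the low-order
// byte of a wchar_t at the lowest address (little endian).
bool wcharIsLittleEndian()
{
    const wchar_t probe = 0x0102;            // high byte 0x01, low byte 0x02
    unsigned char raw[sizeof(wchar_t)];
    std::memcpy(raw, &probe, sizeof probe);  // copy the object representation
    return raw[0] == 0x02;                   // low byte first means little endian
}
```

On an x86 or x64 build this returns true. If the raw bytes of a std::wstring are sent as they sit in memory, the result tells you which BOM matches them: FF FE for little endian, FE FF for big endian.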

Further, until Windows RT, VC++ for desktop Windows in practice only targeted Intel-type (x86/x64) processors. Those are little endian, so there has really only been one endianness type to deal with.
