Difference between Big Endian and Little Endian byte order
Question
What is the difference between Big Endian and Little Endian byte order?
Both of these seem to be related to Unicode and UTF-16. Where exactly do we use this?
Recommended answer
Big-Endian (BE) and Little-Endian (LE) are two ways of ordering the bytes of a multi-byte word. Big-endian stores the most significant byte first; little-endian stores the least significant byte first. For example, when UTF-16 uses two bytes to represent a character, there are two ways to write the character 0x1234 as a sequence of bytes (0x00-0xFF):
Byte Index:     0  1
---------------------
Big-Endian:    12 34
Little-Endian: 34 12
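The table above can be reproduced with a short Python sketch. It uses the built-in `int.to_bytes` and `int.from_bytes`; the variable names (`value`, `big`, `little`) are only illustrative.

```python
value = 0x1234

# Serialize the same 16-bit value in both byte orders.
big = value.to_bytes(2, byteorder="big")        # most significant byte first
little = value.to_bytes(2, byteorder="little")  # least significant byte first

print(big.hex(" "))     # 12 34
print(little.hex(" "))  # 34 12

# Reading the bytes back only gives the original value
# if the reader assumes the same byte order as the writer.
assert int.from_bytes(big, byteorder="big") == value
assert int.from_bytes(little, byteorder="little") == value
assert int.from_bytes(big, byteorder="little") == 0x3412  # wrong order, wrong value
```

The last assertion shows why the byte order has to be known: the same two bytes read in the wrong order yield a different number.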
To decide whether a text uses UTF-16BE or UTF-16LE, the specification recommends prepending a Byte Order Mark (BOM), which is the character U+FEFF, to the string. So, if the first two bytes of a UTF-16 encoded text file are FE FF, the encoding is UTF-16BE. If they are FF FE, it is UTF-16LE.
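As a rough illustration of the BOM check described above, here is a minimal Python sketch. The function names `detect_utf16_byte_order` and `decode_utf16_with_bom` are made up for this example; the BOM constants come from the standard `codecs` module. It assumes the input is already known to be UTF-16 and ignores other encodings (for instance, a UTF-32LE BOM also starts with FF FE).

```python
import codecs
from typing import Optional


def detect_utf16_byte_order(data: bytes) -> Optional[str]:
    """Return the codec name implied by a leading UTF-16 BOM, or None."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    return None  # no BOM: the byte order must be known some other way


def decode_utf16_with_bom(data: bytes) -> str:
    """Decode UTF-16 data, using the BOM to choose the byte order."""
    encoding = detect_utf16_byte_order(data)
    if encoding is None:
        raise ValueError("no UTF-16 BOM found")
    # Skip the two BOM bytes, then decode the rest in the detected order.
    return data[2:].decode(encoding)


print(detect_utf16_byte_order(b"\xfe\xff\x00\x45"))  # utf-16-be
print(detect_utf16_byte_order(b"\xff\xfe\x45\x00"))  # utf-16-le
```

In practice Python's generic `"utf-16"` codec performs this check itself when decoding; the explicit version is only meant to make the rule visible.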
A visual example: the word "Example" in different encodings (UTF-16 with BOM):
Byte Index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
------------------------------------------------------------
ASCII:      45 78 61 6d 70 6c 65
UTF-16BE:   FE FF 00 45 00 78 00 61 00 6d 00 70 00 6c 00 65
UTF-16LE:   FF FE 45 00 78 00 61 00 6d 00 70 00 6c 00 65 00
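The byte dump above can be checked with Python's built-in codecs. This is only a sketch: the `"utf-16-be"` and `"utf-16-le"` codecs do not write a BOM on their own, so the example prepends one explicitly to match the table.

```python
import codecs

word = "Example"

ascii_bytes = word.encode("ascii")
# The explicit -be / -le codecs emit no BOM, so add it by hand.
utf16_be = codecs.BOM_UTF16_BE + word.encode("utf-16-be")
utf16_le = codecs.BOM_UTF16_LE + word.encode("utf-16-le")

for label, data in [("ASCII:", ascii_bytes),
                    ("UTF-16BE:", utf16_be),
                    ("UTF-16LE:", utf16_le)]:
    print(f"{label:<10}{data.hex(' ')}")

# The generic "utf-16" codec reads the BOM, picks the byte order, and strips it.
assert utf16_be.decode("utf-16") == word
assert utf16_le.decode("utf-16") == word
```

Both UTF-16 byte strings decode to the same text, because the BOM tells the decoder which order to use.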
For further information, see the Wikipedia pages on Endianness and UTF-16.