What is the difference between UTF-32 and UCS-4?
Question
What is the difference between UTF-32 and UCS-4? Isn't UTF-32 supposed to be a fixed-width encoding?
Recommended answer
UTF-32 started as a subset of UCS-4. The two are now identical, except that the UTF-32 standard has additional Unicode semantics. See the details on Wikipedia:
The original ISO 10646 standard defines a 31-bit encoding form called UCS-4, in which each encoded character in the Universal Character Set (UCS) is represented by a 32-bit friendly code value in the code space of integers between 0 and hexadecimal 7FFFFFFF.
Because only 17 planes are actually in use, all current code points are between 0 and 0x10FFFF. UTF-32 is a subset of UCS-4 that uses only this range. Since the Principles and Procedures document of JTC1/SC2/WG2 states that all future assignments of characters will be constrained to the BMP or the first 14 supplementary planes, UTF-32 will be able to represent all Unicode characters. Accordingly, UCS-4 and UTF-32 are now identical except that the UTF-32 standard has additional Unicode semantics.
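The two ranges in the quoted text, and the fixed-width property the question asks about, can be checked with a minimal Python sketch. The constants come from the quote above; the function names (`in_ucs4_range`, `in_utf32_range`, `utf32_width`) are illustrative and not part of any standard or library. The width check assumes Python's built-in `utf-32-le` codec, which writes no byte order mark.

```python
# Upper bounds taken from the quoted text; the helper names are illustrative.
UCS4_MAX = 0x7FFFFFFF   # original ISO 10646 UCS-4 code space (31 bits)
UTF32_MAX = 0x10FFFF    # last code point of the 17th plane


def in_ucs4_range(value: int) -> bool:
    """True if value fits the original 31-bit UCS-4 code space."""
    return 0 <= value <= UCS4_MAX


def in_utf32_range(value: int) -> bool:
    """True if value lies within the 17 planes that UTF-32 is limited to."""
    return 0 <= value <= UTF32_MAX


def utf32_width(text: str) -> list:
    """Byte length of each character encoded as UTF-32 (little-endian, no BOM)."""
    return [len(ch.encode("utf-32-le")) for ch in text]


if __name__ == "__main__":
    # 17 planes of 0x10000 code points each end at 0x10FFFF.
    print(hex(17 * 0x10000 - 1))
    # 0x110000 fits the old UCS-4 code space but is outside UTF-32.
    print(in_ucs4_range(0x110000), in_utf32_range(0x110000))
    # An ASCII letter, a BMP character, and a supplementary-plane character.
    print(utf32_width("A€😀"))
```

Every character in the last call, whether from the BMP or from a supplementary plane, occupies exactly four bytes, which is what "fixed-width" means here.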
However, I am not exactly sure what "additional Unicode semantics" means. Maybe someone can provide a better answer.