Does the Unicode Consortium intend to make UTF-16 run out of characters?

Views: 103 Published: 2020/7/13 5:17:09 unicode utf-8 utf-16


Question

The current version of UTF-16 can encode only 1,112,064 distinct code points: the range 0x0-0x10FFFF, minus the 2,048 surrogate code points.
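The 1,112,064 figure and the 0x10FFFF ceiling both fall out of how UTF-16 surrogate pairs work. The following is a minimal arithmetic sketch, not anything taken from the standard's text; the variable names are chosen here purely for illustration.

```python
# Each half of a UTF-16 surrogate pair contributes 10 payload bits.
HIGH_SURROGATES = range(0xD800, 0xDC00)  # 1,024 values
LOW_SURROGATES = range(0xDC00, 0xE000)   # 1,024 values

# Surrogate pairs address the supplementary planes, which start at U+10000.
pairs = len(HIGH_SURROGATES) * len(LOW_SURROGATES)  # 1,048,576 combinations
max_code_point = 0x10000 + pairs - 1

# The surrogate code points themselves are reserved and cannot stand for
# characters, so they are subtracted from the total.
surrogates = len(HIGH_SURROGATES) + len(LOW_SURROGATES)  # 2,048
encodable = (max_code_point + 1) - surrogates

print(hex(max_code_point))  # 0x10ffff
print(f"{encodable:,}")     # 1,112,064
```

Going past 0x10FFFF would therefore mean changing the surrogate mechanism itself, not just assigning more characters.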

Does the Unicode Consortium intend to make UTF-16 run out of characters?

That is, will it ever assign a code point greater than 0x10FFFF?

If not, why would anyone write a UTF-8 parser that accepts 5- or 6-byte sequences? Doing so only adds unnecessary instructions to the function.

Isn't 1,112,064 enough? Do we actually need more characters? I mean: how quickly are we running out?

There is still room for over 860,000 unused characters: plenty for CJK Extension E (~10,000 characters) and 85 more sets just like it, so that we should be ready in the event of contact with the Ferengi culture.

In November 2003 the IETF restricted UTF-8 to end at U+10FFFF with RFC 3629, in order to match the constraints of the UTF-16 character encoding. A UTF-8 parser should therefore not accept 5- or 6-byte sequences, which would overflow the UTF-16 range, nor 4-byte sequences that decode to a value greater than 0x10FFFF.
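To make the RFC 3629 restriction concrete, here is a rough sketch of a decoder for a single UTF-8 sequence that refuses the old 5- and 6-byte forms and any 4-byte sequence above 0x10FFFF. The function name `decode_utf8_sequence` and the constant `MAX_CODE_POINT` are hypothetical, invented for this example; this is not how any particular library implements it.

```python
MAX_CODE_POINT = 0x10FFFF  # upper limit shared with UTF-16


def decode_utf8_sequence(data: bytes) -> int:
    """Decode exactly one UTF-8 sequence and return its code point.

    Raises ValueError for input that RFC 3629 does not allow.
    """
    if not data:
        raise ValueError("empty input")
    lead = data[0]
    if lead < 0x80:                # 0xxxxxxx: 1 byte
        length, value, minimum = 1, lead, 0x0
    elif 0xC0 <= lead < 0xE0:      # 110xxxxx: 2 bytes
        length, value, minimum = 2, lead & 0x1F, 0x80
    elif 0xE0 <= lead < 0xF0:      # 1110xxxx: 3 bytes
        length, value, minimum = 3, lead & 0x0F, 0x800
    elif 0xF0 <= lead < 0xF8:      # 11110xxx: 4 bytes
        length, value, minimum = 4, lead & 0x07, 0x10000
    else:
        # 0x80-0xBF is a stray continuation byte; 0xF8-0xFD used to start
        # 5- and 6-byte sequences; 0xFE and 0xFF were never valid.
        raise ValueError(f"invalid lead byte 0x{lead:02X}")

    if len(data) != length:
        raise ValueError("wrong number of bytes for this lead byte")

    for byte in data[1:]:
        if byte & 0xC0 != 0x80:    # continuation bytes look like 10xxxxxx
            raise ValueError(f"invalid continuation byte 0x{byte:02X}")
        value = (value << 6) | (byte & 0x3F)

    if value < minimum:
        raise ValueError("overlong encoding")
    if 0xD800 <= value <= 0xDFFF:
        raise ValueError("surrogate code point")
    if value > MAX_CODE_POINT:
        raise ValueError("code point above U+10FFFF")
    return value
```

In this sketch, `decode_utf8_sequence(b"\xf4\x8f\xbf\xbf")` returns 0x10FFFF, while `b"\xf4\x90\x80\x80"` (which would decode to 0x110000) and the 5-byte form `b"\xf8\x88\x80\x80\x80"` both raise `ValueError`. Rejecting the longer forms costs one branch on the lead byte, whereas accepting them would need extra cases.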

Please edit in here any character sets that threaten the Unicode code point limit, provided they are over one third the size of CJK Extension E (~10,000 characters):

CJK Extension E (~10,000 characters)
Ferengi culture characters (~5,000 characters)
