Do UTF-8, UTF-16, and UTF-32 differ in the number of characters they can store?


Question

Okay. I know this looks like the typical "Why didn't he just Google it or go to www.unicode.org and look it up?" question, but for such a simple question, the answer still eludes me after checking both sources.

I am pretty sure that all three of these encoding systems support all of the Unicode characters, but I need to confirm it before I make that claim in a presentation.

Bonus question: Do these encodings differ in the number of characters they can be extended to support?

Recommended answer

No, they're simply different encoding methods. They all support encoding the same set of characters: every Unicode code point from U+0000 through U+10FFFF, excluding the surrogate range.
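As a quick check of that claim, here is a minimal Python sketch that round-trips a few sample characters through all three encodings. The helper name `round_trips` and the choice of sample characters are illustrative assumptions, not part of the original answer.

```python
ENCODINGS = ("utf-8", "utf-16", "utf-32")


def round_trips(text: str) -> bool:
    """Return True if every encoding in ENCODINGS decodes back to the same text."""
    return all(text.encode(enc).decode(enc) == text for enc in ENCODINGS)


# ASCII, Latin-1, CJK, an emoji, and the highest code point Unicode defines.
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600", chr(0x10FFFF)]

for ch in samples:
    print(f"U+{ord(ch):04X}", round_trips(ch))
```

Each sample survives all three encodings unchanged, which is what you would expect if they cover the same character set.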

UTF-8 uses anywhere from one to four bytes per character, depending on which character you're encoding. Characters in the ASCII range take only one byte, while characters outside the Basic Multilingual Plane (U+10000 and above, such as emoji and rarer scripts) take four.
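To see the variable width in practice, this small Python sketch measures the UTF-8 length of one character from each size class. The helper `utf8_length` is a hypothetical name introduced here for illustration.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes the given text occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One sample per width: ASCII, Latin-1 supplement, CJK, emoji.
for ch in ["A", "\u00e9", "\u4e2d", "\U0001F600"]:
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s)")
```

The four samples need 1, 2, 3, and 4 bytes respectively.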

UTF-32 uses four bytes per character regardless of which character it is, so it never uses less space than UTF-8 for the same string and almost always uses more. Its main advantage is that you can compute the number of characters (code points) in a UTF-32 string from its byte count alone, by dividing by four.
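The byte-counting property can be sketched in Python as follows. The function name `code_points_from_utf32` is an assumption made for this example, and the `utf-32-le` codec is used because Python's plain `utf-32` codec prepends a 4-byte byte-order mark that would throw the count off by one.

```python
def code_points_from_utf32(data: bytes) -> int:
    """Count code points in BOM-less UTF-32 data: every code point is 4 bytes."""
    return len(data) // 4


text = "h\u00e9llo \U0001F600"      # 7 code points
utf32 = text.encode("utf-32-le")    # fixed width, no BOM
utf8 = text.encode("utf-8")         # variable width

print(code_points_from_utf32(utf32), len(utf32), len(utf8))

# Edge case: text made only of 4-byte UTF-8 characters is the same size in both.
emoji_only = "\U0001F600" * 2
print(len(emoji_only.encode("utf-32-le")), len(emoji_only.encode("utf-8")))
```

For the mixed string, UTF-32 takes 28 bytes against 11 for UTF-8. In the edge case where every character needs four UTF-8 bytes, the two sizes match.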

UTF-16 uses two bytes for most characters and four bytes (a surrogate pair) for the less common ones outside the Basic Multilingual Plane.
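The four-byte case works by splitting a code point into two 16-bit units, known as a surrogate pair. Here is a minimal Python sketch of that arithmetic, checked against Python's own `utf-16-le` codec. The function name `surrogate_pair` is hypothetical and introduced only for this illustration.

```python
import struct


def surrogate_pair(code_point: int) -> tuple:
    """Split a code point in U+10000..U+10FFFF into its UTF-16 surrogate pair."""
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


emoji = "\U0001F600"
manual = surrogate_pair(ord(emoji))
# Decode the codec's 4 bytes as two little-endian 16-bit units for comparison.
from_codec = struct.unpack("<2H", emoji.encode("utf-16-le"))

print([hex(unit) for unit in manual], manual == from_codec)
print(len("A".encode("utf-16-le")), len(emoji.encode("utf-16-le")))
```

For U+1F600 the pair is 0xD83D, 0xDE00, while a character such as "A" fits in a single two-byte unit.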

http://en.wikipedia.org/wiki/Comparison_of_Unicode_encodings
