Do UTF-8, UTF-16, and UTF-32 differ in the number of characters they can store?

Question

Okay. I know this looks like the typical "Why didn't he just Google it or go to www.unicode.org and look it up?" question, but for such a simple question, the answer still eludes me after checking both sources.

I am pretty sure that all three of these encoding systems support all of the Unicode characters, but I need to confirm it before I make that claim in a presentation.

Bonus question: Do these encodings differ in the number of characters they can be extended to support?

Recommended answer

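No, they're simply different encoding methods. All three can encode the same set of characters: every Unicode code point from U+0000 to U+10FFFF, excluding the surrogate range U+D800 to U+DFFF. As currently defined, none of them extends beyond U+10FFFF, so they do not differ on the bonus question either.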
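As a quick sanity check of that claim, here is a minimal Python sketch that round-trips a few boundary code points through all three encodings. The sample list and the helper name `round_trips` are illustrative choices, not part of the original answer, and the `-le` codec variants are used only to keep a byte order mark out of the output.

```python
# Boundary code points of the Unicode range. Surrogates (U+D800..U+DFFF)
# are left out because they are not valid characters on their own.
SAMPLES = [0x0000, 0x007F, 0x0080, 0xFFFF, 0x10000, 0x10FFFF]

# The "-le" variants avoid prepending a byte order mark.
CODECS = ["utf-8", "utf-16-le", "utf-32-le"]


def round_trips(code_point: int, codec: str) -> bool:
    """Return True if the code point survives encode followed by decode."""
    ch = chr(code_point)
    return ch.encode(codec).decode(codec) == ch


# Check every sample against every codec.
results = {
    (cp, codec): round_trips(cp, codec)
    for cp in SAMPLES
    for codec in CODECS
}
all_ok = all(results.values())
print(all_ok)  # True
```

A handful of samples is not a proof, of course, but it shows that the lowest and highest code points are representable in each encoding.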

UTF-8 uses one to four bytes per character, depending on the character being encoded. Characters in the ASCII range take one byte, while characters above U+FFFF (outside the Basic Multilingual Plane), such as emoji and rare CJK ideographs, take four.
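To make the one-to-four-byte range concrete, the following sketch measures the UTF-8 length of one sample character from each size class. The four sample characters are my own picks for illustration.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the given string occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


# One sample per UTF-8 length class, written as escapes to avoid ambiguity.
samples = {
    "A": "A",                  # U+0041, ASCII
    "e-acute": "\u00e9",       # U+00E9, Latin-1 Supplement
    "zhong": "\u4e2d",         # U+4E2D, CJK ideograph in the BMP
    "grin": "\U0001F600",      # U+1F600, emoji outside the BMP
}

utf8_lengths = {name: utf8_len(ch) for name, ch in samples.items()}
print(utf8_lengths)
# {'A': 1, 'e-acute': 2, 'zhong': 3, 'grin': 4}
```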

UTF-32 uses four bytes per character regardless of the character, so it never uses less space than UTF-8 for the same string and almost always uses more. Its main advantage is that the number of code points in a UTF-32 string is simply the byte count divided by four.
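The byte-counting property can be sketched as below. The function name `count_code_points_utf32` is hypothetical, and the input is assumed to be UTF-32 without a byte order mark. The last lines also show one edge case worth knowing: for text made up entirely of characters above U+FFFF, UTF-8 and UTF-32 come out the same size.

```python
def count_code_points_utf32(data: bytes) -> int:
    """Count code points in BOM-less UTF-32 data using only its length."""
    if len(data) % 4 != 0:
        raise ValueError("UTF-32 data must be a multiple of 4 bytes long")
    return len(data) // 4


text = "A\u4e2d\U0001F600"             # three code points of different "sizes"
encoded = text.encode("utf-32-le")     # "-le" variant: no byte order mark
count = count_code_points_utf32(encoded)
print(len(encoded), count)  # 12 3

# Edge case: a string of only 4-byte UTF-8 characters is the same size
# in UTF-8 and UTF-32.
emoji_only = "\U0001F600" * 3
utf8_size = len(emoji_only.encode("utf-8"))
utf32_size = len(emoji_only.encode("utf-32-le"))
print(utf8_size, utf32_size)  # 12 12
```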

UTF-16 uses two bytes for most characters (those in the Basic Multilingual Plane) and four bytes, in the form of a surrogate pair, for the rest.
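The two-versus-four-byte split in UTF-16 can be seen by listing the 16-bit code units each character produces. This is a small sketch with a helper name of my own choosing (`utf16_units`), and it uses the big-endian codec only so that the units read naturally as hex.

```python
def utf16_units(ch: str) -> list[int]:
    """Return the 16-bit code units of the UTF-16 encoding of `ch`."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp_units = utf16_units("\u4e2d")          # U+4E2D, inside the BMP
astral_units = utf16_units("\U0001F600")   # U+1F600, outside the BMP

print([hex(u) for u in bmp_units])     # ['0x4e2d']
print([hex(u) for u in astral_units])  # ['0xd83d', '0xde00']
```

The second result is a surrogate pair: two code units from the reserved range U+D800 to U+DFFF that together stand for a single code point above U+FFFF.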

http://en.wikipedia.org/wiki/Comparison_of_Unicode_encodings
