Why does UTF-8 use more than one byte to represent some characters?

Views: 110 Published: 2020/7/13 4:10:25 Tags: utf-8 character-encoding


Problem description

I recently went through an article on character encoding, and I have a question about one point mentioned there.

In the first figure, the author shows characters, their code points in various character sets, and how they are encoded in various encoding formats. For example, the code point of é is E9. In ISO-8859-1 it is encoded as E9. In UTF-16 it is encoded as 00 E9. But in UTF-8 it is encoded using 2 bytes, C3 A9.
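The byte values quoted from the figure can be checked with a short Python sketch. It is not from the article; the helper name `hex_bytes` is made up for illustration, and UTF-16 is taken as big-endian without a byte-order mark so that it matches the 00 E9 shown. The last lines print the two UTF-8 bytes in binary and try to decode a lone E9 byte as UTF-8, which hints at why the single-byte form is not available there.

```python
# Encode U+00E9 (é) in the three encodings mentioned in the question.
ch = "\u00e9"

def hex_bytes(data: bytes) -> str:
    """Hypothetical helper: format bytes as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in data)

code_point = ord(ch)
latin1 = ch.encode("iso-8859-1")
utf16 = ch.encode("utf-16-be")  # assumption: big-endian, no byte-order mark
utf8 = ch.encode("utf-8")

print(f"code point: {code_point:X}")      # E9
print("ISO-8859-1:", hex_bytes(latin1))   # E9
print("UTF-16BE:  ", hex_bytes(utf16))    # 00 E9
print("UTF-8:     ", hex_bytes(utf8))     # C3 A9

# UTF-8 two-byte layout is 110xxxxx 10xxxxxx; the x bits carry the code point.
utf8_bits = " ".join(f"{b:08b}" for b in utf8)
print("UTF-8 bits:", utf8_bits)           # 11000011 10101001

# A single E9 byte is not a complete UTF-8 sequence on its own.
try:
    bytes([0xE9]).decode("utf-8")
    lone_byte_ok = True
except UnicodeDecodeError:
    lone_byte_ok = False
print("lone E9 decodes as UTF-8:", lone_byte_ok)  # False
```

In UTF-8, only byte values 00 to 7F stand alone as one-byte characters; bytes with the high bit set are reserved for multi-byte sequences, so E9 by itself is rejected by the decoder.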

My question is: why is this required? The character can be represented with 1 byte, so why are two bytes used? Can you please explain?
