Why Java char uses UTF-16?

Problem Description

Recently I have read a lot about Unicode code points and how they evolved over time, and of course I read http://www.joelonsoftware.com/articles/Unicode.html as well.

But one thing I couldn't find the real reason for is why Java uses UTF-16 for a char.

For example, suppose I have a string containing 1024 characters that are all within the ASCII range. That means 1024 * 2 bytes, i.e. 2 KB of memory that the string will consume no matter what.

So if Java's base char were UTF-8, that would be just 1 KB of data. Even if the string contained some characters that need more than one byte, for example ten occurrences of "字" (3 bytes each in UTF-8), the memory consumption would only grow accordingly: (1014 * 1 byte) + (10 * 3 bytes) = 1044 bytes, barely over 1 KB.

Granted, roughly 1 KB vs. 2 KB is not a dramatic difference, and ASCII is not really my point, but what I'm curious about is why it isn't UTF-8, which handles multibyte characters as well. UTF-16 looks like a waste of memory for any string that contains mostly single-byte characters.
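
To make the comparison concrete, here is a minimal sketch (the string contents and counts are just the illustrative numbers from above, and it measures the encoded byte lengths, not the JVM's internal String representation):

    import java.nio.charset.StandardCharsets;

    public class EncodingSizeDemo {
        public static void main(String[] args) {
            // 1014 ASCII letters plus 10 CJK characters ('\u5B57' is 字), echoing the example above
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1014; i++) {
                sb.append('a');
            }
            for (int i = 0; i < 10; i++) {
                sb.append('\u5B57');
            }
            String s = sb.toString();

            System.out.println("chars        : " + s.length());                                   // 1024
            System.out.println("UTF-16 bytes : " + s.getBytes(StandardCharsets.UTF_16BE).length); // 1024 * 2 = 2048
            System.out.println("UTF-8 bytes  : " + s.getBytes(StandardCharsets.UTF_8).length);    // 1014 + 10 * 3 = 1044
        }
    }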

Is there any good reason behind this?

Recommended Answer

Java used UCS-2 before transitioning over to UTF-16 in 2004/2005. The reason for the original choice of UCS-2 was primarily historical:

Unicode was originally designed as a fixed-width 16-bit character encoding. The primitive data type char in the Java programming language was intended to take advantage of this design by providing a simple data type that could hold any character.

This, and the birth of UTF-16, is further explained by the Unicode FAQ page:

Originally, Unicode was designed as a pure 16-bit encoding, aimed at representing all modern scripts. (Ancient scripts were to be represented with private-use characters.) Over time, and especially after the addition of over 14,500 composite characters for compatibility with legacy sets, it became clear that 16-bits were not sufficient for the user community. Out of this arose UTF-16.

As @wero has already mentioned, random access cannot be done efficiently with UTF-8. So, all things weighed up, UCS-2 was seemingly the best choice at the time, particularly as no supplementary characters had been allocated at that stage. That left UTF-16 as the easiest natural progression beyond UCS-2.
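
To see the consequence of that UCS-2 legacy, here is a minimal sketch (the example character U+1F600 is just an arbitrary supplementary character chosen for illustration) showing that anything outside the BMP occupies two char code units, which is why length() and charAt() index UTF-16 code units rather than characters:

    public class SurrogateDemo {
        public static void main(String[] args) {
            // U+1F600 lies outside the BMP, so it needs a surrogate pair in UTF-16
            String s = "A" + new String(Character.toChars(0x1F600)) + "B";

            System.out.println("length()         : " + s.length());                               // 4 code units
            System.out.println("codePointCount() : " + s.codePointCount(0, s.length()));          // 3 code points
            System.out.println("charAt(1) is high surrogate: " + Character.isHighSurrogate(s.charAt(1))); // true

            // Iterating by code point rather than by char handles surrogate pairs correctly
            s.codePoints().forEach(cp -> System.out.printf("U+%04X%n", cp));
        }
    }

Working code point by code point (codePointAt, codePoints()) is how such strings are handled correctly today; that extra care is the price of the original fixed-width assumption described above.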
