Why does Java char use UTF-16?

Question

Recently I read a lot about Unicode code points and how they evolved over time, including http://www.joelonsoftware.com/articles/Unicode.html.

But I couldn't find the real reason why Java uses UTF-16 for a char.

For example, suppose I have a string of 1024 ASCII characters. At 2 bytes per char, it consumes 1024 * 2 bytes = 2 KB of memory no matter what.

So if Java's base char were UTF-8, it would be just 1 KB of data. Even if the string contains characters that need more than one byte, for example 10 occurrences of "字" (3 bytes each in UTF-8), the memory consumption grows only slightly: (1014 * 1 byte) + (10 * 3 bytes) = 1 KB + 20 bytes.

Isn't the result obvious? 1 KB + 20 bytes vs. 2 KB. I'm not talking only about ASCII; my curiosity is why it isn't UTF-8, which also takes care of multibyte chars. UTF-16 looks like a waste of memory for any string that consists mostly of single-byte characters.
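The size comparison above can be checked with a short sketch. The class and method names here are made up for illustration, and `String.repeat` assumes Java 11 or later. It measures the encoded size of the string, not the exact heap footprint of a `String` object, which depends on the JVM.

```java
import java.nio.charset.StandardCharsets;

class EncodingSize {
    // Hypothetical sample matching the question: 1014 ASCII letters
    // followed by 10 copies of U+5B57 ("字").
    static String sample() {
        return "a".repeat(1014) + "\u5B57".repeat(10);
    }

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as UTF-16. UTF_16BE is used because
    // plain UTF_16 would prepend a 2-byte byte-order mark.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String s = sample();
        System.out.println("chars:  " + s.length());    // 1024
        System.out.println("UTF-8:  " + utf8Bytes(s));  // 1044 = 1 KB + 20 bytes
        System.out.println("UTF-16: " + utf16Bytes(s)); // 2048 = 2 KB
    }
}
```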

Is there any good reason behind this?

Recommended answer

One reason is the performance characteristics of random access to, or iteration over, the characters of a String:

UTF-8 uses a variable number of bytes (1-4) to encode a Unicode character. Accessing a character by index with String.charAt(i) would therefore be considerably more complicated to implement, and slower, than the array access used by java.lang.String.
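To make the difference concrete, here is a minimal sketch of what index-based access over UTF-8 bytes might involve. The helper `utf8OffsetOf` is hypothetical, not part of the JDK, and assumes well-formed UTF-8 input. Note also that `charAt` returns a UTF-16 code unit, so characters outside the Basic Multilingual Plane span two `char` values; the sketch ignores that case.

```java
import java.nio.charset.StandardCharsets;

class IndexedAccess {
    // Hypothetical helper: returns the byte offset at which the code point
    // with the given index starts. It has to walk from the beginning,
    // because each code point occupies 1 to 4 bytes.
    static int utf8OffsetOf(byte[] utf8, int index) {
        int offset = 0;
        for (int seen = 0; seen < index; seen++) {
            int lead = utf8[offset] & 0xFF;
            if (lead < 0x80) {
                offset += 1;      // 0xxxxxxx: single-byte (ASCII)
            } else if (lead < 0xE0) {
                offset += 2;      // 110xxxxx: two-byte sequence
            } else if (lead < 0xF0) {
                offset += 3;      // 1110xxxx: three-byte sequence
            } else {
                offset += 4;      // 11110xxx: four-byte sequence
            }
        }
        return offset;
    }

    public static void main(String[] args) {
        String s = "ab\u5B57cd";
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);

        // UTF-16 code units: a direct, constant-time lookup by index.
        System.out.println(s.charAt(3));            // c

        // UTF-8 bytes: the offset is found only by scanning earlier bytes.
        System.out.println(utf8OffsetOf(utf8, 3));  // 5
    }
}
```

With fixed-size code units the position of index i is known directly, whereas the UTF-8 scan takes time proportional to i.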
