Java Unicode encoding

Question

A Java char is 2 bytes (65,536 possible values), but there are 95,221 Unicode characters. Does this mean that a Java application cannot handle certain Unicode characters?

Does this boil down to which character encoding you are using?

Recommended answer

Java's char is a UTF-16 code unit, not a whole character. A character whose code point is above 0xFFFF is encoded as 2 chars (a surrogate pair), so Java can still represent every Unicode character.
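As a minimal sketch of the surrogate-pair encoding described above, the following uses only standard java.lang.Character and String methods. The class name SurrogatePairDemo, the helper toUtf16, and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) as the sample character are illustrative assumptions, not part of the original answer.

```java
class SurrogatePairDemo {
    // Returns the UTF-16 code units (chars) that encode a single code point.
    // Hypothetical helper; it simply wraps Character.toChars.
    static char[] toUtf16(int codePoint) {
        return Character.toChars(codePoint);
    }

    public static void main(String[] args) {
        int codePoint = 0x1D11E; // a code point above 0xFFFF, chosen for illustration
        char[] units = toUtf16(codePoint);
        String s = new String(units);

        System.out.println(units.length);                        // 2 chars for one character
        System.out.println(Character.isHighSurrogate(units[0])); // true
        System.out.println(Character.isLowSurrogate(units[1]));  // true
        System.out.println(s.length());                          // 2 (counts chars)
        System.out.println(s.codePointCount(0, s.length()));     // 1 (counts code points)
    }
}
```

Note that String.length() counts chars, so it reports 2 here even though the string holds a single character.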

See http://www.oracle.com/us/technologies/java/supplementary-142654.html for how to handle those characters in Java.
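One common way to handle such characters, sketched here under the assumption that you want to visit each character exactly once, is to walk the string by code point rather than by char. The class name CodePointWalk and the helper codePointsOf are hypothetical. String.codePointAt and Character.charCount are standard methods.

```java
import java.util.ArrayList;
import java.util.List;

class CodePointWalk {
    // Collects the code points of a string, treating each surrogate pair as one unit.
    static List<Integer> codePointsOf(String s) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result.add(cp);
            i += Character.charCount(cp); // advances by 1 or by 2 for a surrogate pair
        }
        return result;
    }

    public static void main(String[] args) {
        // "a", then U+1D11E (two chars), then "b": 4 chars but 3 code points.
        String s = "a" + new String(Character.toChars(0x1D11E)) + "b";
        for (int cp : codePointsOf(s)) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```

Indexing with charAt(i) in a plain loop would instead visit the two halves of the surrogate pair separately.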

(BTW, in Unicode 5.2 there are 107,154 assigned characters out of 1,114,112 slots.)
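The slot count quoted above can be checked against constants in java.lang.Character. This is only a small arithmetic sketch, and the class name CodeSpaceSize and its method names are hypothetical.

```java
class CodeSpaceSize {
    // Total number of code points: U+0000 through Character.MAX_CODE_POINT (U+10FFFF).
    static int totalCodePoints() {
        return Character.MAX_CODE_POINT + 1;
    }

    // Number of distinct values a single char can hold: 0x0000 through 0xFFFF.
    static int charValues() {
        return Character.MAX_VALUE + 1;
    }

    public static void main(String[] args) {
        System.out.println(totalCodePoints());                // 1114112
        System.out.println(charValues());                     // 65536
        // Code points that need a surrogate pair (above 0xFFFF).
        System.out.println(totalCodePoints() - charValues()); // 1048576
    }
}
```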
