How to put a supplementary Unicode character in a string literal?
Question
How do I put a supplementary Unicode character (say, code point U+10400) in a string literal? I have tried putting a surrogate pair like this:
String text = "TEST \uD801\uDC00";
System.out.println(text);
But it doesn't seem to work.
Update:
The good news is that the string is constructed properly.
Byte array in UTF-8: 54 45 53 54 20 f0 90 90 80
Byte array in UTF-16: fe ff 0 54 0 45 0 53 0 54 0 20 d8 1 dc 0
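The byte dumps above can be reproduced with a small sketch like the following. The class name SupplementaryBytes and the hex helper are made up for illustration; the original post does not show how the dumps were produced, so the unpadded hex format is an assumption chosen to match the dump style.

```java
import java.nio.charset.StandardCharsets;

class SupplementaryBytes {
    // Hypothetical helper: formats each byte as unpadded lowercase hex,
    // separated by spaces, to mimic the dump style used in the question.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "TEST \uD801\uDC00";
        // UTF-8 encodes U+10400 as the four bytes f0 90 90 80.
        System.out.println("UTF-8:  " + hex(text.getBytes(StandardCharsets.UTF_8)));
        // Java's "UTF-16" charset writes a big-endian byte order mark (fe ff) first.
        System.out.println("UTF-16: " + hex(text.getBytes(StandardCharsets.UTF_16)));
    }
}
```

The surrogate pair d8 01 dc 00 appears unchanged in the UTF-16 dump, while UTF-8 encodes the code point directly as a single four-byte sequence.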
The bad news is that it is not printed properly on my Fedora box: I see a square instead of the expected symbol, because my console doesn't support Unicode properly.
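If the platform default charset is part of the problem, one thing to try is wrapping System.out in a PrintStream that always encodes as UTF-8. This is only a sketch: the class name Utf8Console and the utf8 helper are hypothetical, and it will not help if the terminal itself is not in UTF-8 mode or if the console font has no glyph for U+10400 (a square is often a sign of a missing glyph).

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Console {
    // Hypothetical helper: returns an auto-flushing PrintStream that encodes
    // text as UTF-8 regardless of the platform default charset.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is one of the charsets every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String text = "TEST \uD801\uDC00";
        PrintStream out = utf8(System.out);
        out.println(text);
    }
}
```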
Recommended answer
"Works for me". What exactly is the issue?
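public class Test {
    public static void main(String[] args) throws Exception {
        int cp = 0x10400;                    // the supplementary code point U+10400
        String text = "test \uD801\uDC00";   // same code point, written as a surrogate pair
        System.out.println("cp: " + cp);
        System.out.println("found: " + text.codePointAt(5));  // index 5 is the high surrogate
        System.out.println("len: " + text.length());          // counts chars, not code points
    }
}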
Output:
cp: 66560
found: 66560
len: 7
Note that length(), like most String methods, deals with chars (UTF-16 code units), not Unicode characters. So much for awesome Unicode support :)
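To make the difference between chars and code points concrete, here is a minimal sketch using the code-point-aware methods on String and StringBuilder. The class name CodePoints and the count helper are made up for illustration.

```java
class CodePoints {
    // Hypothetical helper: number of Unicode code points in the whole string.
    static int count(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String text = "test \uD801\uDC00";
        System.out.println("chars: " + text.length());      // prints "chars: 7"
        System.out.println("code points: " + count(text));  // prints "code points: 6"

        // The same string can be built from the code point itself,
        // without writing the surrogate pair by hand.
        String built = new StringBuilder("test ").appendCodePoint(0x10400).toString();
        System.out.println("equal: " + built.equals(text)); // prints "equal: true"
    }
}
```

appendCodePoint (or Character.toChars) produces the surrogate pair for you, which avoids computing \uD801\uDC00 manually.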
Happy coding.