How can I get a Unicode character's code?


Question

Let's say I have this:

char registered = '®';

or an umlaut, or any other Unicode character. How can I get its code?

Recommended answer

Just cast it to int:

char registered = '®';
int code = (int) registered;

In fact there's an implicit conversion from char to int, so you don't have to specify it explicitly as I've done above, but I would do so in this case to make it obvious what you're trying to do.

This will give the UTF-16 code unit, which is the same as the Unicode code point for any character defined in the Basic Multilingual Plane (BMP). (And only BMP characters can be represented as single char values in Java; a character outside the BMP takes two char values, a surrogate pair.) As Andrzej Doyle's answer says, if you want the Unicode code point from an arbitrary string, use Character.codePointAt().
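To make the difference between a code unit and a code point concrete, here is a minimal sketch. The class name CodePointDemo and the helper codePointOf are made up for illustration; only Character.codePointAt(CharSequence, int) and String.charAt come from the standard library. The characters are written as \u escapes so the result does not depend on the source file encoding.

```java
public class CodePointDemo {
    // Hypothetical helper: returns the code point that starts at the given
    // char index, combining a surrogate pair into one value when present.
    static int codePointOf(String text, int index) {
        return Character.codePointAt(text, index);
    }

    public static void main(String[] args) {
        char registered = '\u00AE';          // the registered sign
        int codeUnit = (int) registered;     // cast is optional, kept for clarity
        System.out.println(codeUnit);        // 174, i.e. 0xAE

        // U+1D11E (musical symbol G clef) lies outside the BMP,
        // so it is stored as two char values.
        String clef = "\uD834\uDD1E";
        System.out.println(clef.length());            // 2 code units
        System.out.println((int) clef.charAt(0));     // 55348, the high surrogate 0xD834
        System.out.println(codePointOf(clef, 0));     // 119070, i.e. 0x1D11E
    }
}
```

For a BMP character the cast and codePointAt agree; for anything outside the BMP only codePointAt gives the real code point.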

Once you've got the UTF-16 code unit or Unicode code point, both of which are integers, it's up to you what you do with them. If you want a string representation, you need to decide exactly what kind of representation you want. (For example, if you know the value will always be in the BMP, you might want a fixed 4-digit hex representation prefixed with U+, e.g. "U+0020" for space.) That's beyond the scope of this question though, as we don't know what the requirements are.
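As one possible string representation, the U+ form mentioned above could be produced roughly like this. This is only a sketch under the assumption that uppercase hex padded to at least four digits is what you want; the class name CodePointFormat and the method toUPlus are hypothetical names, while String.format is the standard library call.

```java
public class CodePointFormat {
    // Hypothetical helper: formats a code point as "U+" followed by
    // uppercase hex, zero-padded to at least 4 digits.
    static String toUPlus(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        System.out.println(toUPlus(' '));        // U+0020
        System.out.println(toUPlus('\u00AE'));   // U+00AE
        System.out.println(toUPlus(0x1D11E));    // U+1D11E (five digits, outside the BMP)
    }
}
```

The %04X width is a minimum, so values outside the BMP simply print with more digits instead of being truncated.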
