How to classify Japanese characters as either kanji or kana?
Question
Given the text below, how can I classify each character as kana or kanji?
誰か確認上記これらのフ
To get something like this:
誰 - kanji
か - kana
確 - kanji
認 - kanji
上 - kanji
記 - kanji
こ - kana
れ - kana
ら - kana
の - kana
フ - kana
(Sorry if I did it incorrectly.)
Recommended answer
This functionality is built into the Character.UnicodeBlock class. Some examples of the Unicode blocks related to the Japanese language:
Character.UnicodeBlock.of('誰') == CJK_UNIFIED_IDEOGRAPHS
Character.UnicodeBlock.of('か') == HIRAGANA
Character.UnicodeBlock.of('フ') == KATAKANA
Character.UnicodeBlock.of('ﾌ') == HALFWIDTH_AND_FULLWIDTH_FORMS
Character.UnicodeBlock.of('！') == HALFWIDTH_AND_FULLWIDTH_FORMS
Character.UnicodeBlock.of('。') == CJK_SYMBOLS_AND_PUNCTUATION
But, as always, the devil is in the details:
Character.UnicodeBlock.of('Ａ') == HALFWIDTH_AND_FULLWIDTH_FORMS
where Ａ is the full-width character. So this is in the same category as the halfwidth Katakana ﾌ above. Note that the full-width Ａ is different from the normal (half-width) A:
Character.UnicodeBlock.of('A') == BASIC_LATIN
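Putting this together, here is a minimal sketch of a per-character classifier for the text in the question. The class name KanaKanjiClassifier, the method classify, and the "other" label are made up for illustration and are not part of any library. The set of blocks treated as kanji or kana is an assumption you may want to widen or narrow, and the range U+FF66 to U+FF9F is used to pick out halfwidth katakana from the other characters in HALFWIDTH_AND_FULLWIDTH_FORMS. The sample string is written with Unicode escapes so that the source compiles regardless of file encoding.

```java
import java.lang.Character.UnicodeBlock;

class KanaKanjiClassifier {

    // Returns "kanji", "kana", or "other" for a single Unicode code point.
    // The choice of blocks below is an assumption, not an exhaustive list.
    static String classify(int codePoint) {
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        if (block == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || block == UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || block == UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS) {
            return "kanji";
        }
        if (block == UnicodeBlock.HIRAGANA
                || block == UnicodeBlock.KATAKANA
                || block == UnicodeBlock.KATAKANA_PHONETIC_EXTENSIONS) {
            return "kana";
        }
        // This block mixes fullwidth Latin letters and punctuation with
        // halfwidth katakana, so only the katakana range counts as kana.
        if (block == UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
                && codePoint >= 0xFF66 && codePoint <= 0xFF9F) {
            return "kana";
        }
        return "other";
    }

    public static void main(String[] args) {
        // The sample text from the question, spelled out as escapes.
        String text = "\u8AB0\u304B\u78BA\u8A8D\u4E0A\u8A18\u3053\u308C\u3089\u306E\u30D5";
        // Iterate by code point so supplementary characters stay intact.
        text.codePoints().forEach(cp ->
                System.out.println(new String(Character.toChars(cp)) + " - " + classify(cp)));
    }
}
```

Note that this classifies by block only, so punctuation that lives inside the kana blocks (for example the katakana middle dot) would also be reported as kana. Filter those out separately if that matters for your use case.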