Java - removing strange characters from a String
Question
How do I remove strange and unwanted Unicode characters (such as a black diamond with question mark) from a String?
Update:
Please tell me the Unicode character string or regex that corresponds to "a black diamond with a question mark in it".
Recommended answer
The black diamond with a question mark is usually not a character from your original text; it is a placeholder. Unicode defines it as the replacement character, U+FFFD (�), which is typically substituted when a character cannot be decoded or displayed, for example when bytes are read with the wrong encoding. Its exact appearance varies depending on the font you're using.
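To address the update directly: in a Java string literal the character can be written as "\uFFFD", and Java regular expressions accept the same escape, so the pattern "\\uFFFD" matches it. The following is a minimal sketch, assuming the placeholder is already present in the String; the class and method names are hypothetical and chosen only for illustration.

```java
class ReplacementCharStripper {
    // U+FFFD REPLACEMENT CHARACTER, written as a Unicode escape
    private static final String REPLACEMENT_CHAR = "\uFFFD";

    // Hypothetical helper: removes every occurrence of U+FFFD from the input
    static String stripReplacementChar(String input) {
        // String.replace does a literal (non-regex) replacement
        return input.replace(REPLACEMENT_CHAR, "");
    }

    // Same result using a regex; "\\uFFFD" is the regex escape for U+FFFD
    static String stripReplacementCharRegex(String input) {
        return input.replaceAll("\\uFFFD", "");
    }

    public static void main(String[] args) {
        String dirty = "caf\uFFFD au lait";
        System.out.println(stripReplacementChar(dirty));
        System.out.println(stripReplacementCharRegex(dirty));
    }
}
```

Note that removing U+FFFD only hides the symptom. If the character appears because bytes were decoded with the wrong charset, the original character is already lost at that point, and the more reliable fix is to decode with the correct encoding in the first place.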
You can use java.text.Normalizer to decompose accented characters into a base letter plus combining marks, and then remove whatever is not in the "normal" ASCII character set.
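One common way to apply this is sketched below, assuming that dropping every non-ASCII character is acceptable for your data. The class name AsciiFolder and the method toAscii are hypothetical; Normalizer.normalize, Normalizer.Form.NFD and the \p{ASCII} regex class are part of the standard Java library.

```java
import java.text.Normalizer;

class AsciiFolder {
    // Hypothetical helper: decomposes the input, then drops all non-ASCII characters
    static String toAscii(String input) {
        // NFD splits a precomposed character such as U+00E9 into 'e' + U+0301
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove everything outside ASCII: combining marks, U+FFFD, and so on
        return decomposed.replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // "Creme brulee" with accents, written with escapes to avoid source-encoding issues
        String accented = "Cr\u00E8me br\u00FBl\u00E9e";
        System.out.println(toAscii(accented)); // prints "Creme brulee"
    }
}
```

This approach is lossy: characters with no ASCII base letter (for example CJK text) are removed entirely, so it only suits cases where an ASCII-only result is really what you want.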