How to remove high-ASCII characters like ®, ©, ™ from a string in Java
Question
I want to detect and remove high-ASCII characters like ®, ©, ™ from a String in Java. Is there an open-source library that can do this?
Recommended answer
If you need to remove all non-US-ASCII characters (i.e. anything outside 0x00-0x7F), you can do something like this:
s = s.replaceAll("[^\\x00-\\x7f]", "");
If you need to filter many strings, it is better to use a precompiled pattern, so the regex is compiled once rather than on every call:
// requires: import java.util.regex.Pattern;
private static final Pattern nonASCII = Pattern.compile("[^\\x00-\\x7f]");
...
s = nonASCII.matcher(s).replaceAll("");
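For reference, the precompiled-pattern approach could be wrapped in a small helper along these lines. This is a minimal sketch: the class name `AsciiFilter` and the method name `stripNonAscii` are made up for illustration and are not from the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class AsciiFilter {
    // Matches any character outside the US-ASCII range 0x00-0x7F.
    // Compiled once and reused for every call.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\x00-\\x7f]");

    private AsciiFilter() {}

    // Returns a copy of s with every non-US-ASCII character removed.
    static String stripNonAscii(String s) {
        return NON_ASCII.matcher(s).replaceAll("");
    }
}
```

With this sketch, `AsciiFilter.stripNonAscii("Java® ©2010 Acme™")` returns `"Java 2010 Acme"`.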
And if it is really performance-critical, perhaps Alex Nikolaenkov's suggestion would be better.
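The suggestion mentioned above is not reproduced on this page. One common regex-free alternative, which may or may not match that suggestion, is to copy the characters by hand and keep only those in the ASCII range. A rough sketch, with the hypothetical names `AsciiLoopFilter` and `stripNonAscii`:

```java
// Hypothetical helper class; not taken from the original thread.
final class AsciiLoopFilter {
    private AsciiLoopFilter() {}

    // Returns a copy of s that keeps only chars in the range 0x00-0x7F.
    static String stripNonAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Surrogate halves are above 0x7F, so supplementary
            // characters are dropped as well.
            if (c <= 0x7F) {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Whether this is actually faster than the precompiled pattern depends on the workload, so it is worth measuring before switching.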