Java string searching ignoring accents
Problem description
I am trying to write a filter function for my application that will take an input string and filter out all objects that don't match the given input in some way. The easiest way to do this would be to use String's contains method, i.e. just check if the object (the String variable in the object) contains the string specified in the filter, but this won't account for accents.
The objects in question are basically Persons, and the strings I am trying to match are names. So, for example, if someone searches for Joao, I would expect Joáo to be included in the result set. I have already used the Collator class in my application to sort by name, and it works well because it can compare accented characters, i.e. using the UK Locale, á comes before b but after a. But obviously it doesn't return 0 if you compare a and á, because they are not considered equal.
So does anyone have any idea how I might be able to do this?
Make use of java.text.Normalizer and a shot of regex to get rid of the diacritics.
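import java.text.Normalizer;
import java.text.Normalizer.Form;

// Decompose to NFD, then strip the combining marks (U+0300 to U+036F).
public static String removeDiacriticalMarks(String string) {
    return Normalizer.normalize(string, Form.NFD)
            .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}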
Which you can use as follows:
String value = "Joáo";
String comparisonMaterial = removeDiacriticalMarks(value); // Joao
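To tie this back to the filter described in the question, here is a minimal sketch of how the normalized strings might be used with contains. The class name AccentInsensitiveFilter and the methods fold, matches and filter are hypothetical, made up for illustration. Lower-casing with Locale.ROOT is an extra assumption that the question did not ask for, and names are plain Strings here rather than Person objects.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the name and its methods are illustrative only.
final class AccentInsensitiveFilter {

    // Same technique as above: decompose to NFD, then strip combining marks.
    static String removeDiacriticalMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Assumption: the search should also ignore case, so fold it as well.
    static String fold(String s) {
        return removeDiacriticalMarks(s).toLowerCase(Locale.ROOT);
    }

    // True if name contains query once accents and case are folded away.
    static boolean matches(String name, String query) {
        return fold(name).contains(fold(query));
    }

    // Keep only the names that match the query.
    static List<String> filter(List<String> names, String query) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (matches(name, query)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "\u00e1" is the accented a from the question's example name.
        List<String> names = Arrays.asList("Jo\u00e1o Silva", "Joao Pereira", "Maria");
        // Keeps the first two names and drops "Maria".
        System.out.println(filter(names, "joao"));
    }
}
```

Note that this only removes marks that NFD splits off as combining characters. Letters with no canonical decomposition, such as ø or ß, pass through unchanged and would need separate handling.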