Java string searching ignoring accents

Views: 168  Published: 2017/11/8 19:36:58  java string localization filter diacritics


Question


I am trying to write a filter function for my application that will take an input string and filter out all objects that don't match the given input in some way. The easiest way to do this would be to use String's contains method, i.e. just check if the object (the String variable in the object) contains the string specified in the filter, but this won't account for accents.



The objects in question are basically Persons, and the strings I am trying to match are names. So, for example, if someone searches for Joao I would expect Joáo to be included in the result set. I have already used the Collator class in my application to sort by name, and it works well because it can compare accented characters, i.e. using the UK Locale, á comes before b but after a. But obviously it doesn't return 0 if you compare a and á, because they are not equal.

So does anyone have any idea how I might be able to do this?
Solution
Make use of java.text.Normalizer and a shot of regex to get rid of the diacritics.
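import java.text.Normalizer;
import java.text.Normalizer.Form;

// NFD splits each accented letter into a base letter plus combining marks; the regex then removes the marks.
public static String removeDiacriticalMarks(String string) {
    return Normalizer.normalize(string, Form.NFD)
        .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
}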
Which you can use as follows:
String value = "Joáo";
String comparisonMaterial = removeDiacriticalMarks(value); // Joao
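
To connect this back to the filter described in the question, here is a minimal sketch of an accent-insensitive contains check built on the same method. The class name AccentFilter and the helpers fold and filterNames are hypothetical names chosen for illustration, and the lower-casing step is an added assumption (the question only asks about accents). Accented letters are written as Unicode escapes so the example does not depend on the source file encoding.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class AccentFilter {

    // Same approach as the answer: decompose to NFD, then strip the combining marks.
    public static String removeDiacriticalMarks(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Hypothetical helper: fold accents and case (case folding is an assumption,
    // not something the question asked for).
    public static String fold(String s) {
        return removeDiacriticalMarks(s).toLowerCase(Locale.ROOT);
    }

    // Hypothetical helper: keep only the names that contain the query after folding.
    // The original, unmodified names are returned.
    public static List<String> filterNames(List<String> names, String query) {
        String foldedQuery = fold(query);
        List<String> matches = new ArrayList<>();
        for (String name : names) {
            if (fold(name).contains(foldedQuery)) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "Jo\u00e1o Silva",     // "Joao" with an acute accent on the a
                "John Smith",
                "JO\u00c3O Pereira");  // upper-case "JOAO" with a tilde on the A
        // Prints the names that match the unaccented query "joao".
        System.out.println(filterNames(names, "joao"));
    }
}
```

With this input both accented names match and "John Smith" does not. One caveat: the regex only strips characters in the Combining Diacritical Marks block (U+0300 to U+036F), so letters that have no canonical decomposition, such as ø, pass through unchanged.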


                        