忽略检查中的特殊字符 [英] Ignore special characters in Examine
本文介绍了忽略检查中的特殊字符的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
在Umbraco,我使用Examine在网站中进行搜索,但内容为法文.一切工作正常,但当我搜索Français"时,结果与"Francais"不同.有没有办法忽略那些法语字符?我尝试找到Leucene/Examine的FrenchAnalyser,但没有找到任何东西.我使用Fuzzy,因此即使单词不相同,它也会返回结果.
In Umbraco, I use Examine to search in the website but the content is in french. Everything works fine except when I search for "Français" it's not the same result as "Francais". Is there a way to ignore those french characters? I try to find a FrenchAnalyser for Leucene/Examine but did not found anything. I use Fuzzy so it return results even if the words is not the same.
这是我搜索的代码:
public static ISearchResults Search(string searchTerm)
{
var provider = ExamineManager.Instance.SearchProviderCollection["ExternalSearcher"];
var criteria = provider.CreateSearchCriteria(BooleanOperation.Or);
var crawl = criteria.GroupedOr(BoostedSearchableFields, searchTerm.Boost(15))
.Or().GroupedOr(BoostedSearchableFields, searchTerm.Fuzzy(Fuzziness))
.Or().GroupedOr(SearchableFields, searchTerm.Fuzzy(Fuzziness))
.Not().Field("umbracoNavHide", "1");
return provider.Search(crawl.Compile());
}
推荐答案
我们最终使用了基于SnowballAnalyzer
public class CustomAnalyzer : SnowballAnalyzer
{
public CustomAnalyzer() : base("French") { }
public override TokenStream TokenStream(string fieldName, TextReader reader)
{
TokenStream result = base.TokenStream(fieldName, reader);
result = new ISOLatin1AccentFilter(result);
return result;
}
}
这篇关于忽略检查中的特殊字符的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文