Need an explanation of Solr's language stemmer
Problem description
I'm using Nutch with Solr to develop a search engine for Arabic texts. I need to apply a stemmer to my Arabic texts, and while searching for Solr stemmers I found that it provides these two filters:
<filter class="solr.ArabicNormalizationFilterFactory"/>
<filter class="solr.ArabicStemFilterFactory"/>
I tried them but did not understand what they do. Could anyone help me with some examples?
Also, do they perform the following two operations?
العملات stemmed to عملة
البسَاتِين ، بساتينكم stemmed to بستان
Thanks.
Recommended answer
You can find some details here: http://lucene.apache.org/core/3_6_0/api/contrib-analyzers/org/apache/lucene/analysis/ar/ArabicStemmer.html
It says:
Stemming is defined as:
- Removal of attached definite article, conjunction, and prepositions.
- Stemming of common suffixes.
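To make the two rules above concrete, here is a rough Python sketch of a normalizer followed by a light stemmer. It is not Solr's code: the affix lists, the length checks, and the function names (`normalize`, `stem`, `analyze`) are my assumptions, modeled on how I understand the Lucene Arabic analysis classes to behave, so treat the results as illustrative only and check real output on Solr's analysis page.

```python
import re

# Assumed affix lists for a light Arabic stemmer (hypothetical, not loaded from Solr).
PREFIXES = ["ال", "وال", "بال", "كال", "فال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي"]

# Tatweel (U+0640) and the short-vowel marks U+064B..U+0652.
DIACRITICS = re.compile("[\u0640\u064B-\u0652]")


def normalize(word):
    """Orthographic normalization: strip marks and unify letter variants."""
    word = DIACRITICS.sub("", word)
    word = re.sub("[\u0622\u0623\u0625]", "\u0627", word)  # alef variants -> bare alef
    word = word.replace("\u0649", "\u064A")  # alef maksura -> yeh
    word = word.replace("\u0629", "\u0647")  # teh marbuta -> heh
    return word


def stem(word):
    """Light stemming: drop at most one prefix, then any matching suffixes."""
    for prefix in PREFIXES:
        # Assumption: a one-letter prefix needs 3 letters left over, others need 2.
        min_rest = 3 if len(prefix) == 1 else 2
        if word.startswith(prefix) and len(word) >= len(prefix) + min_rest:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        # Assumption: a suffix is removed only if at least 2 letters remain.
        if word.endswith(suffix) and len(word) >= len(suffix) + 2:
            word = word[:-len(suffix)]
    return word


def analyze(word):
    """Apply the two steps in the same order as the two filters in the question."""
    return stem(normalize(word))
```

With this sketch, `analyze("العملات")` and `analyze("عملة")` both give `عمل`, so the plural and the singular would match each other even though neither becomes the dictionary form عملة. A broken plural such as بساتين is not mapped to بستان: `analyze("البسَاتِين")` gives `بسات`, and `بساتينكم` comes back unchanged because the pronoun suffix كم is not in the assumed list. In other words, this kind of stemmer only strips affixes; it does not look up roots or singular forms.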