Does Lucene StandardAnalyzer remove stopwords and have a stemming function?

Views: 146 | Published: 2019/1/8 12:01:28 | Tags: java, search, lucene


Question

I tested StandardAnalyzer with IndexWriter and found that it removes stopwords automatically, even though I never supplied a stopword list. This is the code I used:

StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35, analyzer);

Where is the default stopword list? Also, does this analyzer automatically stem words too?

Recommended answer

According to the API docs, there is a default set of stopwords (taken from the English language), stored in StandardAnalyzer.STOP_WORDS_SET. It is used when you create the analyzer with the constructor public StandardAnalyzer(Version matchVersion), which is exactly what you do. The set is exactly the same as StopAnalyzer.ENGLISH_STOP_WORDS_SET. To change this behaviour, use one of the other constructors and pass the analyzer a different (possibly empty) set of stopwords.

StandardAnalyzer does not stem words. If you need stemming, use, for example, SnowballAnalyzer.
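
To make the distinction concrete, here is a minimal plain-Java sketch of what "stopword removal without stemming" means. It does not call Lucene at all: the class StopwordSketch, the method analyze, and the short SAMPLE_STOP_WORDS set are hypothetical names invented for illustration, and the sample set is only a handful of English words, not the real contents of StandardAnalyzer.STOP_WORDS_SET. The tokenization is also a simplification of what a real analyzer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordSketch {
    // Assumption: a small sample for illustration only. The authoritative
    // default list lives in StandardAnalyzer.STOP_WORDS_SET.
    static final Set<String> SAMPLE_STOP_WORDS = Collections.unmodifiableSet(
            new HashSet<String>(Arrays.asList("a", "an", "and", "is", "of", "the", "to")));

    // Lowercases the text, splits it on anything that is not a letter or digit,
    // and drops tokens found in stopWords. Tokens are NOT stemmed.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> tokens = new ArrayList<String>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !stopWords.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "The analyzer is removing the stopwords";

        // With a stopword set: "the" and "is" disappear, "removing" stays unstemmed.
        System.out.println(analyze(text, SAMPLE_STOP_WORDS));
        // prints [analyzer, removing, stopwords]

        // With an empty set: every token is kept.
        System.out.println(analyze(text, Collections.<String>emptySet()));
        // prints [the, analyzer, is, removing, the, stopwords]
    }
}
```

The second call mirrors the option the answer mentions: supplying an empty stopword set keeps every token. In Lucene itself that choice is made by picking the constructor, as described above, and a stemming analyzer would additionally reduce a word such as "removing" to its stem.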
