Does anyone know how to expand queries using WordNet with Lucene 3.6?
Question
I found the class WordnetSynonymParser in org.apache.lucene.analysis.synonym, but there are no examples of its usage in either the API documentation or on Google. Does anyone have experience with it?
Thanks!
EDIT: I know that there used to be a class called SynExpand, but it disappeared in version 3.6...
I tried:
try {
    // wn_s.pl is the WordNet Prolog synonym file; analyzer and synonymMap are declared elsewhere
    FileReader rulesReader = new FileReader("wn/wn_s.pl");
    // Arguments: dedup = true, expand = true, analyzer used to tokenize each synonym entry
    WordnetSynonymParser parser = new WordnetSynonymParser(true, true, analyzer);
    parser.add(rulesReader);
    synonymMap = parser.build();
} catch (Exception e) {
    e.printStackTrace();
    System.exit(1);
}
But I get the following error:
java.text.ParseException: Invalid synonym rule at line 109
at org.apache.lucene.analysis.synonym.WordnetSynonymParser.add(WordnetSynonymParser.java:75)
at pirServer.QueryClassifier.<init>(QueryClassifier.java:77)
at pirServer.PIRServer.main(PIRServer.java:32)
Caused by: java.lang.IllegalArgumentException: term: course of action analyzed to a token with posinc != 1
at org.apache.lucene.analysis.synonym.SynonymMap$Builder.analyze(SynonymMap.java:131)
at org.apache.lucene.analysis.synonym.WordnetSynonymParser.parseSynonym(WordnetSynonymParser.java:92)
at org.apache.lucene.analysis.synonym.WordnetSynonymParser.add(WordnetSynonymParser.java:67)
... 2 more
Recommended answer
I am working on something similar and have just read the documentation, so a relevant caution from the SynonymFilter doc is fresh in my mind:
"This token stream cannot properly handle position increments != 1, ie, you should place this filter before filtering out stop words"
http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/analysis/synonym/SynonymFilter.html
It's possible that the analyzer you're passing to the WordnetSynonymParser (which you don't describe in your post) removes stop words, as most of them do. In that case "of" would be dropped from "course of action", leaving a gap in the token positions and causing:
java.lang.IllegalArgumentException: term: course of action analyzed to a token with posinc != 1
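To make the failure concrete, here is a minimal plain-Java sketch of what stop-word removal does to position increments. It is not Lucene code: the class PosIncDemo, the method positionIncrements and the small stop list are hypothetical names chosen for illustration. It assumes the analyzer drops "of" as a stop word and adds each skipped position to the increment of the next surviving token, which is the situation the error message describes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration only: simulates position increments after stop-word removal.
// This does not call Lucene; all names here are hypothetical.
final class PosIncDemo {
    // Assumed stop list for the demo; "of" is the entry that matters for this question.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "of", "the"));

    // Returns the position increment of each token that survives stop-word removal.
    static List<Integer> positionIncrements(String phrase, Set<String> stopWords) {
        List<Integer> increments = new ArrayList<>();
        int skipped = 0; // stop words dropped since the last emitted token
        for (String token : phrase.toLowerCase().split("\\s+")) {
            if (stopWords.contains(token)) {
                skipped++;                    // a removed token leaves a gap
            } else {
                increments.add(1 + skipped);  // the gap is folded into the next token
                skipped = 0;
            }
        }
        return increments;
    }

    public static void main(String[] args) {
        // With stop words removed, "action" ends up with an increment of 2.
        System.out.println(positionIncrements("course of action", STOP_WORDS));
        // With no stop words removed, every increment is 1.
        System.out.println(positionIncrements("course of action", Collections.<String>emptySet()));
    }
}
```

Under these assumptions the first call prints [1, 2] and the second prints [1, 1, 1]. Only the second is acceptable to the synonym map builder, which suggests building the map with an analyzer that keeps stop words and, if stop words must be removed at all, doing so after the SynonymFilter in the analysis chain, as the quoted documentation advises.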