Lucene search: match any word in a phrase

Question

I want to search with a string that contains many words and retrieve the documents that match any of them. My indexing method is the following:

 // Build the analyzer before the writer config that depends on it
 Analyzer analyzer = CustomAnalyzer.builder()
            .withTokenizer("standard")
            .addTokenFilter("lowercase")
            .addTokenFilter("stop")
            .addTokenFilter("porterstem")
            .addTokenFilter("capitalization")
            .build();
 config = new IndexWriterConfig(analyzer);
 writer = new IndexWriter(indexDirectory, config);

 // One document per file: the analyzed text plus the file number
 Document document = new Document();
 document.add(new TextField("termos", text, Field.Store.YES));
 document.add(new TextField("docNumber", fileNumber, Field.Store.YES));
 writer.addDocument(document);
 writer.commit();

And here is my search method. I don't want to look for the specific phrase, but for any of the words in it. The analyzer used for searching is the same one used for indexing.

// createPhraseQuery matches the words only as one exact phrase
Query query = new QueryBuilder(analyzer).createPhraseQuery("termos","THE_PHRASE");
String indexDir = rootProjectFolder + "/indexDir/";
// DirectoryReader.open expects a Directory, not a String path
IndexReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(indexDir)));
IndexSearcher searcher = new IndexSearcher(reader);
TopScoreDocCollector collector = TopScoreDocCollector.create(1000,1000);
searcher.search(query,collector);

I'm new to Lucene. Can someone help me?

Recommended answer

Using createPhraseQuery("termos", "list of words") will try to match exactly the phrase "list of words" with a phrase slop of 0, that is, the words must appear next to each other and in that order.

If you want to match any term in a list of words, you can use createBooleanQuery with BooleanClause.Occur.SHOULD:

new QueryBuilder(analyzer).createBooleanQuery("termos", terms, BooleanClause.Occur.SHOULD);
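To make the difference between the two query types concrete, here is a minimal plain-Java sketch that models the matching rules only. It does not use Lucene: the class MatchModes and its tokenize helper are hypothetical stand-ins for the analyzer (lowercasing and splitting only, no stemming or stop words), and scoring is ignored entirely.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class MatchModes {
    // Hypothetical stand-in for the analyzer: lowercase, split on non-alphanumerics.
    public static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase().split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Phrase semantics with slop 0: query tokens must appear consecutively, in order.
    public static boolean matchesPhrase(String doc, String query) {
        List<String> docTokens = tokenize(doc);
        List<String> queryTokens = tokenize(query);
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(docTokens, queryTokens) >= 0;
    }

    // SHOULD semantics: at least one query token appears anywhere in the document.
    public static boolean matchesAnyTerm(String doc, String query) {
        Set<String> docTokens = new HashSet<>(tokenize(doc));
        return tokenize(query).stream().anyMatch(docTokens::contains);
    }

    public static void main(String[] args) {
        String doc = "Lucene is a search library written in Java";
        String query = "java search engine";
        System.out.println(matchesPhrase(doc, query));  // false
        System.out.println(matchesAnyTerm(doc, query)); // true
    }
}
```

The document contains "java" and "search", but not as the consecutive sequence "java search engine", so only the any-term rule accepts it. That is the behavior the question asks for.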

As an alternative, you can also use createMinShouldMatchQuery(), which lets you require that a given fraction of the query terms match, e.g. at least 10 percent of the terms:

new QueryBuilder(analyzer).createMinShouldMatchQuery("termos", terms, 0.1f);
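The sketch below illustrates how a fraction could translate into a required number of matching terms. It is a plain-Java model, not Lucene code: the class MinShouldMatchModel and its methods are hypothetical, and it assumes that the fraction is multiplied by the number of query terms and truncated to an integer, and that a query made only of optional clauses still needs at least one of them to match. Check both assumptions against the QueryBuilder source for your Lucene version.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class MinShouldMatchModel {
    // Hypothetical stand-in for the analyzer: lowercase, split, keep distinct terms.
    public static Set<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase().split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // Assumption: threshold = truncated (fraction * termCount), but never below 1.
    public static int requiredMatches(int termCount, float fraction) {
        return Math.max(1, (int) (fraction * termCount));
    }

    // True when enough distinct query terms occur in the document.
    public static boolean matches(String doc, String query, float fraction) {
        Set<String> docTerms = tokenize(doc);
        Set<String> queryTerms = tokenize(query);
        if (queryTerms.isEmpty()) {
            return false;
        }
        long hits = queryTerms.stream().filter(docTerms::contains).count();
        return hits >= requiredMatches(queryTerms.size(), fraction);
    }

    public static void main(String[] args) {
        String doc = "lucene search library for java";
        String query = "java python search rust";
        System.out.println(matches(doc, query, 0.5f));  // true: 2 of 4 terms required, 2 found
        System.out.println(matches(doc, query, 0.75f)); // false: 3 of 4 terms required
        System.out.println(requiredMatches(5, 0.1f));   // 1: 10% of 5 truncates to 0
    }
}
```

Under these assumptions, a fraction of 0.1f only starts to demand more than one matching term once the query has 20 or more terms. For shorter queries it behaves like the plain SHOULD query.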
