Get matched terms from Lucene query
Question
Given a Lucene search query like +(letter:A letter:B letter:C) +(style:Capital), how can I tell which of the three letters actually matched a given document? I don't care where they matched or how many times they matched; I just need to know whether they matched.
The intent is to take the initial query ("A B C"), remove the terms that matched successfully (A and B), and then do further processing on the remainder (C).
Recommended answer
Although the sample is in C#, the Lucene APIs are very similar (apart from some upper/lower case differences). I don't think it would be hard to translate to Java.
Here is the usage:
List<Term> terms = new List<Term>();    // will be filled with non-matched terms
List<Term> hitTerms = new List<Term>(); // will be filled with matched terms
GetHitTerms(query, searcher, docId, hitTerms, terms);
Here is the method:
// Recursively collects the terms of `query` that matched document `docId`
// into hitTerms, and the terms that did not match into rest.
void GetHitTerms(Query query, IndexSearcher searcher, int docId, List<Term> hitTerms, List<Term> rest)
{
    if (query is TermQuery)
    {
        // A single term: use Explain to check whether it matches this document.
        if (searcher.Explain(query, docId).IsMatch())
            hitTerms.Add((query as TermQuery).GetTerm());
        else
            rest.Add((query as TermQuery).GetTerm());
        return;
    }

    if (query is BooleanQuery)
    {
        // Recurse into every clause of the boolean query.
        BooleanClause[] clauses = (query as BooleanQuery).GetClauses();
        if (clauses == null) return;
        foreach (BooleanClause bc in clauses)
        {
            GetHitTerms(bc.GetQuery(), searcher, docId, hitTerms, rest);
        }
        return;
    }

    if (query is MultiTermQuery)
    {
        // Rewrite the multi-term query into a BooleanQuery of TermQuery clauses, then recurse.
        if (!(query is FuzzyQuery)) // FuzzyQuery doesn't support SetRewriteMethod
            (query as MultiTermQuery).SetRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE);
        GetHitTerms(query.Rewrite(searcher.GetIndexReader()), searcher, docId, hitTerms, rest);
    }
}
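To make the recursion easier to follow without a Lucene index at hand, here is a minimal, self-contained Java sketch of the same traversal applied to the query from the question. It is only an illustration: `HitTermsSketch`, its nested `Query`, `TermQuery` and `BooleanQuery` classes, and the `docTerms` set are hypothetical stand-ins, not the Lucene API. In particular, the set lookup takes the place of the `searcher.Explain(query, docId).IsMatch()` check, and the `MultiTermQuery` rewrite step is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for Lucene's query types; this is NOT the Lucene API.
class HitTermsSketch {

    interface Query {}

    // Leaf node: one "field:text" term, loosely playing the role of TermQuery.
    static final class TermQuery implements Query {
        final String term;
        TermQuery(String term) { this.term = term; }
    }

    // Inner node: a group of sub-queries, loosely playing the role of BooleanQuery.
    static final class BooleanQuery implements Query {
        final List<Query> clauses;
        BooleanQuery(Query... clauses) { this.clauses = Arrays.asList(clauses); }
    }

    // Walks the query tree and sorts every leaf term into hitTerms or rest.
    // docTerms (the terms present in one document) replaces the Explain-based match check.
    static void getHitTerms(Query query, Set<String> docTerms,
                            List<String> hitTerms, List<String> rest) {
        if (query instanceof TermQuery) {
            String term = ((TermQuery) query).term;
            if (docTerms.contains(term)) {
                hitTerms.add(term);
            } else {
                rest.add(term);
            }
        } else if (query instanceof BooleanQuery) {
            for (Query clause : ((BooleanQuery) query).clauses) {
                getHitTerms(clause, docTerms, hitTerms, rest);
            }
        }
    }

    public static void main(String[] args) {
        // +(letter:A letter:B letter:C) +(style:Capital)
        Query query = new BooleanQuery(
            new BooleanQuery(new TermQuery("letter:A"),
                             new TermQuery("letter:B"),
                             new TermQuery("letter:C")),
            new BooleanQuery(new TermQuery("style:Capital")));

        // A document that contains A and B in capital style, but not C.
        Set<String> doc = new HashSet<>(
            Arrays.asList("letter:A", "letter:B", "style:Capital"));

        List<String> hitTerms = new ArrayList<>();
        List<String> rest = new ArrayList<>();
        getHitTerms(query, doc, hitTerms, rest);

        System.out.println("matched: " + hitTerms); // matched: [letter:A, letter:B, style:Capital]
        System.out.println("rest: " + rest);        // rest: [letter:C]
    }
}
```

The `rest` list is what the question calls the remainder: the terms left over for further processing once the matched ones have been removed.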