Finding the start and end of a match with Lucene

Views: 165 | Published: 2019/1/8 13:36:45 | Tags: java, lucene

Question

I would like to find the start and end positions of a match from a Lucene (version 3.0.2 for Java) query. It seems like I should be able to get this information from Highlighter or FastVectorHighlighter, but these classes seem to return only a text fragment with the relevant text highlighted. Is there any way to get this information, either with a Highlighter or from the ScoreDoc itself?

Update: I found this related question: Finding the position of search hits from Lucene

But I think the answer by Allasso won't work for me because my queries are phrases, not individual terms.

Recommended answer

If I were you, I'd just take the code from FastVectorHighlighter. The relevant code is in FieldTermStack:

        // Excerpt in C# (this looks like the Lucene.Net port of FieldTermStack).
        // termSet holds the query terms for this field.
        List<string> termSet = fieldQuery.getTermSet(fieldName);
        VectorHighlightMapper tfv = new VectorHighlightMapper(termSet);
        reader.GetTermFreqVector(docId, fieldName, tfv);  // <-- look at this line

        string[] terms = tfv.GetTerms();
        foreach (string term in terms)
        {
            if (!termSet.Contains(term)) continue;  // skip terms that are not in the query
            int index = tfv.IndexOf(term);
            // Start/end character offsets of every occurrence of the term
            TermVectorOffsetInfo[] tvois = tfv.GetOffsets(index);
            if (tvois == null) return; // just return to make null snippets
            // Token positions of every occurrence of the term
            int[] poss = tfv.GetTermPositions(index);
            if (poss == null) return; // just return to make null snippets
            for (int i = 0; i < tvois.Length; i++)
                termList.AddLast(new TermInfo(term, tvois[i].GetStartOffset(), tvois[i].GetEndOffset(), poss[i]));
        }

The major thing there is reader.GetTermFreqVector(). Like I said, FastVectorHighlighter already does some legwork that I would just copy, but if you want, that GetTermPositions call should do everything you need. (In the code above, GetOffsets supplies the start and end character offsets of each occurrence, and GetTermPositions supplies the token positions, which is what lets you check that the terms of a phrase sit next to each other.)
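Since the question is about phrases rather than single terms, here is a rough Java sketch of the step that comes after reading the term vector: combining per-term offsets and positions into phrase matches. It is not FastVectorHighlighter's actual code. TermHit and PhraseOffsets are hypothetical names made up for illustration, and the sketch assumes you have already collected (term, start offset, end offset, position) tuples the way the excerpt above does. In the Java API I believe those values come from TermPositionVector (getOffsets and getTermPositions), and the field has to be indexed with term vectors that store positions and offsets. The sketch only handles exact phrases (no slop) and assumes a single term per position.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for one term occurrence read from a term vector. */
final class TermHit {
    final String term;
    final int startOffset; // character offset where the token starts
    final int endOffset;   // character offset just past the token
    final int position;    // token position within the field

    TermHit(String term, int startOffset, int endOffset, int position) {
        this.term = term;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
        this.position = position;
    }
}

/** Hypothetical helper: finds exact phrase matches among term occurrences. */
final class PhraseOffsets {
    /** Returns one {startOffset, endOffset} pair per occurrence of the phrase. */
    static List<int[]> find(List<TermHit> hits, List<String> phrase) {
        List<int[]> spans = new ArrayList<int[]>();
        if (phrase.isEmpty()) return spans;

        // Index hits by token position so neighbours can be looked up directly.
        // Assumption: at most one term per position (no synonyms stacked on a token).
        Map<Integer, TermHit> byPosition = new HashMap<Integer, TermHit>();
        for (TermHit hit : hits) byPosition.put(hit.position, hit);

        // Walk candidates in document order so the spans come out sorted.
        List<TermHit> sorted = new ArrayList<TermHit>(hits);
        sorted.sort(Comparator.comparingInt((TermHit h) -> h.position));

        for (TermHit first : sorted) {
            if (!first.term.equals(phrase.get(0))) continue;
            TermHit last = first;
            boolean matched = true;
            // Every following phrase term must sit at the next consecutive position.
            for (int i = 1; i < phrase.size(); i++) {
                TermHit next = byPosition.get(first.position + i);
                if (next == null || !next.term.equals(phrase.get(i))) {
                    matched = false;
                    break;
                }
                last = next;
            }
            // The match runs from the first term's start to the last term's end.
            if (matched) spans.add(new int[] { first.startOffset, last.endOffset });
        }
        return spans;
    }
}
```

For the text "the quick brown fox and the quick red fox", feeding in the hits for "the" (offsets 0-3 at position 0, 24-27 at position 5) and "quick" (offsets 4-9 at position 1, 28-33 at position 6) and asking for the phrase "the quick" gives two spans, 0-9 and 24-33.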
