Problems using Lucene Highlighter

Problem description

I am using Lucene Highlighter 2.4.1 in my application to get the best-matching fragments and display them. To do so I call a function String[] getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query, String fieldName, String fieldContents, int fragmentsNumber, int fragmentSize). For example:

String text = doc.get("MetaData");
getFragmentsWithHighlightedTerms(analyzer, query, "MetaData", text, 5, 100);

The function getFragmentsWithHighlightedTerms() is defined as follows:

private static String[] getFragmentsWithHighlightedTerms(Analyzer analyzer, Query query,
        String fieldName, String fieldContents, int fragmentsNumber, int fragmentSize)
{
    // Tokenize the stored field text with the given analyzer.
    TokenStream stream = TokenSources.getTokenStream(fieldName, fieldContents, analyzer);
    // The scorer is built on a caching wrapper around that token stream.
    SpanScorer scorer = new SpanScorer(query, fieldName, new CachingTokenFilter(stream));
    Fragmenter fragmenter = new SimpleSpanFragmenter(scorer, fragmentSize);

    Highlighter highlighter = new Highlighter(scorer);
    highlighter.setTextFragmenter(fragmenter);
    highlighter.setMaxDocCharsToAnalyze(Integer.MAX_VALUE);

    // Note: the raw stream, not the CachingTokenFilter above, is passed here.
    String[] fragments = highlighter.getBestFragments(stream, fieldContents, fragmentsNumber);

    return fragments;
}

My trouble is that highlighter.getBestFragments() returns duplicates: if I display, say, the first 5 fragments, nos. 1 and 3 are the same. I do not understand what is causing this. Is there a problem with the code?

Solution

I don't have the code in front of me, but I think you are getting an array of arrays, so you would need to do this:

item[] = fragments[0]
fragment = item[0]

or just take one item out of the fragments array.
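
If the duplicates described in the question are exact string matches, one possible stopgap is to filter them out after the call returns. The sketch below is only that: FragmentDeduper and dedupe are hypothetical names, not part of Lucene, and the sample strings stand in for whatever getBestFragments() returns. It uses only the Java standard library and needs Java 7 or later for the diamond operator.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: drops exact duplicate fragments.
class FragmentDeduper {

    // Returns the fragments with exact duplicates removed. A LinkedHashSet
    // keeps the first occurrence of each string in its original position,
    // so the ordering produced by the highlighter is preserved.
    static String[] dedupe(String[] fragments) {
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(fragments));
        return unique.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Assumed sample data standing in for the highlighter's result;
        // the first and third entries are identical, as in the question.
        String[] fragments = {
            "the <B>quick</B> fox",
            "a <B>quick</B> dog",
            "the <B>quick</B> fox"
        };
        System.out.println(Arrays.toString(dedupe(fragments)));
    }
}
```

This hides the symptom rather than explaining it, and it can leave fewer fragments than requested, so asking the highlighter for a few more than needed and trimming afterwards may be worthwhile.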
