Lucene payload scoring


Question

I want to figure out how payload scoring works in Lucene. Since I don't understand where PayloadFunction fits in, I don't think I really understand how the whole thing works. I tried googling it, but couldn't find much apart from advice to read the source. It would be nice if someone could explain it here; otherwise, source code it is :)

Recommended answer

There are three parts to it. First, you generate payloads during analysis. This is done with PayloadAttribute: add the attribute in a token filter and set it on the terms you want while the text is being analyzed.

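// Imports assume the Lucene 3.x package layout
import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
import org.apache.lucene.index.Payload;

class MyFilter extends TokenFilter {

  private final PayloadAttribute attr;

  public MyFilter(TokenStream input) {
    super(input);  // TokenFilter needs the stream it wraps
    attr = addAttribute(PayloadAttribute.class);
  }

  @Override
  public final boolean incrementToken() throws IOException {
    if (input.incrementToken()) {
      // Attach the same float payload (42) to every token
      Payload p = new Payload(PayloadHelper.encodeFloat(42));
      attr.setPayload(p);
      return true;
    } else {
      attr.setPayload(null);
      return false;  // end of stream
    }
  }
}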
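
To make the idea of a float payload concrete, here is a minimal standalone sketch of packing a float into bytes and reading it back. PayloadBytesSketch and its two methods are hypothetical names invented for illustration and do not use Lucene; the big-endian 4-byte layout is an assumption about the general idea, not a statement about the exact bytes PayloadHelper writes.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for illustration; not a Lucene class.
class PayloadBytesSketch {

  // Pack a float into 4 bytes (ByteBuffer is big-endian by default).
  static byte[] encodeFloat(float value) {
    return ByteBuffer.allocate(4).putFloat(value).array();
  }

  // Read a float back from 4 bytes starting at the given offset.
  static float decodeFloat(byte[] bytes, int offset) {
    return ByteBuffer.wrap(bytes, offset, 4).getFloat();
  }

  public static void main(String[] args) {
    byte[] payload = encodeFloat(42f);
    System.out.println(payload.length);           // prints 4
    System.out.println(decodeFloat(payload, 0));  // prints 42.0
  }
}
```

The offset parameter mirrors the way a payload is usually handed back as a slice of a larger byte array rather than as its own array.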

Then, at search time, you use the special query class PayloadTermQuery. It behaves like SpanTermQuery but also keeps track of the payloads stored in the index. With a custom Similarity implementation you can score each payload occurrence in a document.

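// Imports assume the Lucene 3.x package layout
import org.apache.lucene.analysis.payloads.PayloadHelper;
import org.apache.lucene.search.DefaultSimilarity;

public class MySimilarity extends DefaultSimilarity {

  // Scores one term occurrence from the payload stored with it
  @Override
  public float scorePayload(int docID, String fieldName,
                            int start, int end, byte[] payload,
                            int offset, int length) {
    if (payload != null) {
      // Decode the float that was written at index time
      return PayloadHelper.decodeFloat(payload, offset);
    } else {
      return 1.0f;  // neutral score when the occurrence has no payload
    }
  }
}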

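Finally, a PayloadFunction aggregates the per-occurrence payload scores within a document to produce the final document score. This is where it fits in: you pass the function to the PayloadTermQuery constructor, and Lucene ships implementations such as AveragePayloadFunction, MaxPayloadFunction and MinPayloadFunction.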
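
As a rough illustration of that aggregation step, the sketch below combines a handful of per-occurrence payload scores into a single value, once by averaging and once by taking the maximum. PayloadAggregationSketch and its methods are hypothetical names for a plain-Java model, not Lucene's PayloadFunction API, and returning 1.0 for an empty input is an assumption chosen to match the neutral score used in the Similarity example.

```java
// Hypothetical stand-in for illustration; not Lucene's PayloadFunction API.
class PayloadAggregationSketch {

  // Average of all payload scores seen for one document.
  static float average(float[] payloadScores) {
    if (payloadScores.length == 0) {
      return 1.0f;  // assumption: neutral score when nothing was seen
    }
    float sum = 0f;
    for (float score : payloadScores) {
      sum += score;
    }
    return sum / payloadScores.length;
  }

  // Largest payload score seen for one document.
  static float max(float[] payloadScores) {
    if (payloadScores.length == 0) {
      return 1.0f;  // assumption: neutral score when nothing was seen
    }
    float best = payloadScores[0];
    for (float score : payloadScores) {
      best = Math.max(best, score);
    }
    return best;
  }

  public static void main(String[] args) {
    // Three occurrences of the term in one document, each with a payload score
    float[] scores = {2.0f, 4.0f, 6.0f};
    System.out.println(average(scores));  // prints 4.0
    System.out.println(max(scores));      // prints 6.0
  }
}
```

Which aggregation to choose depends on whether a single strongly weighted occurrence should dominate (maximum) or whether every occurrence should contribute (average).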
