Remove results below a certain score threshold in Solr/Lucene?
Question
Is there built-in functionality in Solr/Lucene to filter out results that fall below a certain score threshold? For example, if I provide a score threshold of 0.2, all documents with a score below 0.2 should be removed from my results. My intuition is that this is possible by updating or customizing Solr or Lucene.
Could you point me in the right direction on how to do this?
Thanks in advance!
Recommended answer
You could write your own Collector that skips any document the scorer places below your threshold. Below is a simple example of this using Lucene.Net 2.9.1.2 and C#. It stores only the IDs of the matching documents, so you'll need to modify the example if you want to keep the calculated score.
    using System;
    using System.Collections.Generic;
    using Lucene.Net.Index;
    using Lucene.Net.Search;

    public class ScoreLimitingCollector : Collector {
        // Lowest score (inclusive) a document must reach to be collected.
        private readonly Single _lowerInclusiveScore;
        private readonly List<Int32> _docIds = new List<Int32>();
        private Scorer _scorer;
        // Offset of the current segment reader within the whole index.
        private Int32 _docBase;

        // Index-wide IDs of the documents that met the threshold.
        public IEnumerable<Int32> DocumentIds {
            get { return _docIds; }
        }

        public ScoreLimitingCollector(Single lowerInclusiveScore) {
            _lowerInclusiveScore = lowerInclusiveScore;
        }

        public override void SetScorer(Scorer scorer) {
            _scorer = scorer;
        }

        public override void Collect(Int32 doc) {
            // Score the current document and keep it only if it meets the threshold.
            var score = _scorer.Score();
            if (_lowerInclusiveScore <= score)
                _docIds.Add(_docBase + doc);
        }

        public override void SetNextReader(IndexReader reader, Int32 docBase) {
            // doc values passed to Collect are relative to this reader.
            _docBase = docBase;
        }

        public override bool AcceptsDocsOutOfOrder() {
            // Order does not matter because each document is tested independently.
            return true;
        }
    }
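If you do want to keep the calculated score, one possible modification is sketched below. This is only an illustration against the same Lucene.Net 2.9 Collector API, not part of the original answer: the class name ScoredHitCollector and its Hits property are made up for this example. It records each accepted document ID together with its raw score.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;
using Lucene.Net.Search;

// Hypothetical variant of ScoreLimitingCollector that also stores each score.
public class ScoredHitCollector : Collector {
    private readonly float _lowerInclusiveScore;
    private readonly List<KeyValuePair<int, float>> _hits =
        new List<KeyValuePair<int, float>>();
    private Scorer _scorer;
    private int _docBase;

    public ScoredHitCollector(float lowerInclusiveScore) {
        _lowerInclusiveScore = lowerInclusiveScore;
    }

    // Index-wide document ID paired with its score, in collection order (not sorted).
    public IList<KeyValuePair<int, float>> Hits {
        get { return _hits; }
    }

    public override void SetScorer(Scorer scorer) {
        _scorer = scorer;
    }

    public override void Collect(int doc) {
        float score = _scorer.Score();
        if (score >= _lowerInclusiveScore)
            _hits.Add(new KeyValuePair<int, float>(_docBase + doc, score));
    }

    public override void SetNextReader(IndexReader reader, int docBase) {
        _docBase = docBase;
    }

    public override bool AcceptsDocsOutOfOrder() {
        return true;
    }
}
```

Assuming you already have an IndexSearcher named searcher and a Query named query (both placeholders here), you would pass the collector to `searcher.Search(query, collector)` and then read `collector.Hits`. The hits come back in whatever order Lucene visited them, so sort the list by score yourself if you need ranked output.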