What are norms in Lucene
Question
I don't understand what they are, and would really appreciate a simple explanation showing what value they bring to the world without too much implementation detail of how they work.
Recommended answer
A norm is part of the calculation of a score. The norm could be calculated however you like, really. The main thing that sets the norm apart is that it is calculated at index time. Generally, other factors influencing the score are calculated at query time, based on how well the document matches the query. The norm, by contrast, is stored along with the document, which saves work at query time.
The standard implementation can be found, and is well described, in Lucene's TFIDFSimilarity. There, the norm is the product of the field boost (or the product of all field boosts, if multiple have been set on the field) and "lengthNorm" (a calculated factor designed to weight matches on shorter documents more heavily). Neither of these depends on the makeup of the query, so both are good candidates to be calculated and stored at index time.
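To make that product concrete, here is a minimal standalone sketch. It does not call Lucene; the class and method names (NormSketch, lengthNorm, norm) are made up for illustration, and the 1/sqrt(numTerms) formula is an assumption based on the classic TF-IDF similarity, so check the TFIDFSimilarity documentation for the version you use.

```java
class NormSketch {
    // Assumed length normalization: 1 / sqrt(number of terms in the field).
    // Shorter fields get a larger factor, so matches in them weigh more.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Hypothetical helper: multiply all boosts set on the field, then
    // multiply by the length factor. Nothing here depends on the query,
    // which is why the result can be computed once at index time.
    static float norm(int numTerms, float... fieldBoosts) {
        float boost = 1.0f;
        for (float b : fieldBoosts) {
            boost *= b;
        }
        return boost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        System.out.println(norm(4));          // short field, no boost
        System.out.println(norm(100));        // long field, no boost
        System.out.println(norm(100, 2.0f));  // long field with a boost of 2
    }
}
```

With these assumptions a 4-term field gets a norm of 0.5, while a 100-term field gets a much smaller one unless a boost compensates.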
The resulting norms are then stored in a compressed, and highly lossy, single-byte format (with approx. 1 significant decimal digit of accuracy).
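To give a feel for how lossy a single byte is, the sketch below squeezes a positive float into 8 bits by keeping its exponent and only the top two mantissa bits. It is modeled on my recollection of Lucene's SmallFloat.floatToByte315 and should be read as an assumption, not as the exact code Lucene runs; the names NormByteSketch, encode and decode are hypothetical.

```java
class NormByteSketch {
    // Offset that rebases the float exponent so the supported range
    // fits into the 255 non-zero byte values (assumed constants).
    private static final int BASE = (63 - 15) << 3;

    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        // Drop the low 21 bits: what remains is the sign, the 8-bit
        // exponent, and the top two mantissa bits.
        int small = bits >> (24 - 3);
        if (small <= BASE) {
            // Zero or negative maps to 0; tiny positives map to the
            // smallest non-zero code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= BASE + 0x100) {
            return (byte) -1; // too large: clamp to the biggest code
        }
        return (byte) (small - BASE);
    }

    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << (24 - 3);
        bits += (63 - 15) << 24; // undo the rebasing done in encode
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        float[] samples = {1.0f, 0.5f, 0.31622776f, 0.30151135f, 0.25f};
        for (float s : samples) {
            System.out.println(s + " -> " + decode(encode(s)));
        }
    }
}
```

Under this scheme 1.0 round-trips exactly, but 1/sqrt(10) (about 0.316) comes back as 0.3125, and 1/sqrt(11) (about 0.3015) collapses to 0.25, the same byte as 1/sqrt(16). Fields of 11 to 16 terms would therefore be indistinguishable by norm alone.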