What's the point of Lucene NumericUtils.IntToPrefixCoded?
Question
I've been looking at Subtext's Lucene.Net implementation as a guide for doing something similar with our websites. When Subtext indexes or searches for a given post, it runs the ID through NumericUtils.IntToPrefixCoded. According to the Lucene docs, it does some shifting but doesn't lose precision. So, what's the point? What does it do, and why?
Recommended answer
You need to look at the class documentation, which explains it in more detail:
To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
This class generates terms to achieve this: First the numerical integer values need to be converted to strings. For that integer values (32 bit or 64 bit) are made unsigned and the bits are converted to ASCII chars with each 7 bit. The resulting string is sortable like the original integer value. Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.
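The encoding described in the quoted documentation can be sketched in a few lines. This is only an illustration modeled on that description, not Lucene's own code: the class name PrefixCodedSketch, the method name encode, and the marker offset 0x60 are assumptions made for this example. It flips the sign bit so the value orders as unsigned, drops the lowest `shift` bits, and writes the rest as 7-bit chars behind a first char that records the shift.

```java
class PrefixCodedSketch {
    // Assumed offset for the first (shift marker) char. Lucene uses a
    // constant for this purpose; the exact value here is an assumption.
    static final char SHIFT_START = 0x60;

    // Hypothetical helper: encodes value with the lowest `shift` bits removed.
    static String encode(int value, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be in 0..31");
        }
        // Number of 7-bit chars needed for the remaining (32 - shift) bits.
        int nChars = (31 - shift) / 7 + 1;
        char[] buf = new char[nChars + 1];
        // First char records how many bits were shifted away.
        buf[0] = (char) (SHIFT_START + shift);
        // Flip the sign bit so negative ints sort before positive ones
        // when compared as unsigned, then drop the low bits.
        int bits = (value ^ 0x80000000) >>> shift;
        // Fill from the right, 7 bits per char, most significant bits first.
        for (int i = nChars; i >= 1; i--) {
            buf[i] = (char) (bits & 0x7F);
            bits >>>= 7;
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        // String order follows integer order at full precision (shift 0).
        System.out.println(encode(-5, 0).compareTo(encode(3, 0)) < 0);
        // At shift 8, values that differ only in their low 8 bits share a term.
        System.out.println(encode(1000, 8).equals(encode(1023, 8)));
        System.out.println(encode(1000, 8).equals(encode(1024, 8)));
    }
}
```

With shift 0 no bits are removed, so the term keeps the full precision of the int; that is presumably the form an exact ID lookup relies on.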
As I understand it, the intToPrefixCoded method does exactly that: it takes an int value, shifts it, and returns a sortable String as explained above.
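To see why terms at several shift values help range queries, here is a rough sketch of the recursive splitting the documentation mentions. It is a simplified illustration under stated assumptions, not Lucene's implementation: the class RangeSplitSketch and the method split are hypothetical names, the bounds are assumed to be non-negative values below 2^31, and each result is just a {low, high, shift} triple rather than an encoded term.

```java
import java.util.ArrayList;
import java.util.List;

class RangeSplitSketch {
    // Splits [min, max] into sub-ranges, each stored as {low, high, shift}.
    // The edges are matched at low shift (high precision), the center at
    // higher shift (lower precision). Assumes 0 <= min, max < 2^31.
    static List<long[]> split(long min, long max, int precisionStep) {
        List<long[]> out = new ArrayList<>();
        if (min > max) {
            return out;
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            // Does the bound stick out of a block of the next, coarser level?
            boolean hasLower = (min & mask) != 0L;
            boolean hasUpper = (max & mask) != mask;
            long nextMin = (hasLower ? min + diff : min) & ~mask;
            long nextMax = (hasUpper ? max - diff : max) & ~mask;
            if (shift + precisionStep >= 32 || nextMin > nextMax) {
                // No coarser level fits: emit what is left and stop.
                add(out, min, max, shift);
                break;
            }
            if (hasLower) {
                add(out, min, min | mask, shift);
            }
            if (hasUpper) {
                add(out, max & ~mask, max, shift);
            }
            min = nextMin;
            max = nextMax;
        }
        return out;
    }

    // Records a sub-range; the upper bound gets the shifted-away bits set
    // so that {low, high} describes every value the coarse term covers.
    private static void add(List<long[]> out, long lo, long hi, int shift) {
        out.add(new long[] {lo, hi | ((1L << shift) - 1L), shift});
    }

    public static void main(String[] args) {
        for (long[] r : split(3, 1000, 4)) {
            System.out.println(r[0] + ".." + r[1] + " at shift " + r[2]);
        }
    }
}
```

Each sub-range at a given shift would be matched against terms encoded with that same shift, so the wide middle of the range needs only a handful of coarse terms instead of one term per value.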