Lucene RangeQuery doesn't filter appropriately


Question

I'm using RangeQuery to get all the documents whose amount is between, say, 0 and 2. When I execute the query, Lucene also returns documents whose amount is greater than 2. What am I missing here?

Here is my code:

Term lowerTerm = new Term("amount", minAmount);
Term upperTerm = new Term("amount", maxAmount);

RangeQuery amountQuery = new RangeQuery(lowerTerm, upperTerm, true);

finalQuery.Add(amountQuery, BooleanClause.Occur.MUST);

This is what goes into my index:

doc.Add(new Field("amount", amount.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.YES));

Recommended answer

UPDATE: As @basZero said in his comment, starting with Lucene 2.9 you can add numeric fields to your documents. Just remember to search with NumericRangeQuery instead of RangeQuery.

Lucene treats numbers as words, so it orders them alphabetically:

0
1
12
123
2
22

That means that, for Lucene, 12 lies between 0 and 2. If you want a proper numerical range, you need to index the numbers zero-padded and then do a range search of [0000 TO 0002]. (The amount of padding you need depends on the expected range of values.)
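To make the ordering problem concrete, here is a minimal sketch in plain Java (no Lucene dependency) that compares amounts as strings, the way a term-based range does, before and after zero-padding. The class name `PaddedAmounts`, the helpers `pad` and `inRange`, and the width of 4 are assumptions chosen for illustration; they are not part of the Lucene API, and the same idea carries over to Lucene.NET.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PaddedAmounts {

    // Left-pads a non-negative integer with zeros to a fixed width (hypothetical helper).
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // Returns the terms t with lower <= t <= upper under String.compareTo,
    // mimicking an inclusive range over untokenized string terms.
    static List<String> inRange(List<String> terms, String lower, String upper) {
        List<String> hits = new ArrayList<>();
        for (String t : terms) {
            if (t.compareTo(lower) >= 0 && t.compareTo(upper) <= 0) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] amounts = {0, 1, 12, 123, 2, 22};
        List<String> raw = new ArrayList<>();
        List<String> padded = new ArrayList<>();
        for (int a : amounts) {
            raw.add(Integer.toString(a));
            padded.add(pad(a, 4));
        }
        // Unpadded: "12" and "123" also fall inside ["0", "2"].
        System.out.println(inRange(raw, "0", "2"));
        // Padded: only 0000, 0001 and 0002 fall inside ["0000", "0002"].
        System.out.println(inRange(padded, pad(0, 4), pad(2, 4)));
    }
}
```

The same padding has to be applied both when indexing the field and when building the lower and upper terms of the query; otherwise the comparison mixes padded and unpadded strings.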

If you have negative numbers, just add another zero for non-negative numbers. (WRONG WRONG WRONG. See the update.)

If your numbers include a fraction part, leave it as is and zero-pad the integer part only.

Examples (the two negative entries were struck out in the original answer; see the update):

-00002.12
-00001

000000
000001
000003.1415
000022
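As a rough check of the padding rule for non-negative values with a fraction part, the sketch below pads only the integer part and then sorts the resulting strings. It is plain Java with no Lucene dependency; `PaddedDecimals`, `pad`, and the width of 6 are hypothetical names and values picked for illustration.

```java
import java.util.Arrays;

public class PaddedDecimals {

    // Zero-pads the integer part of a non-negative decimal string to intWidth
    // digits and leaves the fraction part untouched (hypothetical helper).
    static String pad(String number, int intWidth) {
        if (number.startsWith("-")) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        int dot = number.indexOf('.');
        String intPart = dot < 0 ? number : number.substring(0, dot);
        String fraction = dot < 0 ? "" : number.substring(dot);
        if (intPart.length() > intWidth) {
            throw new IllegalArgumentException("integer part is wider than the padding");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = intPart.length(); i < intWidth; i++) {
            sb.append('0');
        }
        return sb.append(intPart).append(fraction).toString();
    }

    public static void main(String[] args) {
        String[] inputs = {"22", "3.1415", "1", "0"};
        String[] terms = new String[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            terms[i] = pad(inputs[i], 6);
        }
        Arrays.sort(terms); // plain string order, as for index terms
        System.out.println(Arrays.toString(terms));
        // prints [000000, 000001, 000003.1415, 000022]
    }
}
```

One caveat with this scheme: values that differ only in trailing zeros, such as 3.1 and 3.10, produce different strings, so it helps to normalize the fraction part before indexing.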

UPDATE: Negative numbers are a bit tricky, since -1 comes before -2 alphabetically. This article gives a complete explanation of dealing with negative numbers, and numbers in general, in Lucene. Basically, you have to "encode" numbers into something that makes the order of the items make sense.
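One common way to get such an order-preserving encoding for integers is to flip the sign bit and write the result as fixed-width hexadecimal. The sketch below shows that technique in plain Java; it is an assumption for illustration and not necessarily the scheme the article describes, and `SortableLong`, `encode`, and `decode` are hypothetical names rather than Lucene API.

```java
import java.util.Arrays;
import java.util.Locale;

public class SortableLong {

    // Flipping the sign bit maps Long.MIN_VALUE..Long.MAX_VALUE onto
    // 0..2^64-1 in unsigned order; 16 hex digits keep the width fixed,
    // so string order matches numeric order.
    static String encode(long value) {
        return String.format(Locale.ROOT, "%016x", value ^ Long.MIN_VALUE);
    }

    // Reverses encode(): parse the unsigned hex and flip the sign bit back.
    static long decode(String term) {
        return Long.parseUnsignedLong(term, 16) ^ Long.MIN_VALUE;
    }

    public static void main(String[] args) {
        long[] values = {-2, -1, 0, 1, 12, 2};
        String[] terms = new String[values.length];
        for (int i = 0; i < values.length; i++) {
            terms[i] = encode(values[i]);
        }
        Arrays.sort(terms); // plain string order, as for index terms
        long[] sorted = new long[terms.length];
        for (int i = 0; i < terms.length; i++) {
            sorted[i] = decode(terms[i]);
        }
        System.out.println(Arrays.toString(sorted));
        // prints [-2, -1, 0, 1, 2, 12]
    }
}
```

As with zero-padding, the range bounds in the query would need to go through the same `encode` step as the indexed values.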
