MongoDB index strategy for range queries on different fields


Question

Almost all my documents include two fields: a start timestamp and a finish timestamp. In each of my queries I need to get the elements that fall within a selected period of time, so start should be after the selected start value and final should be before the selected end timestamp.

The query looks like:

db.collection.find({start:{$gt:DateTime(...)}, final:{$lt:DateTime(...)}})

So what is the best indexing strategy for that scenario?

By the way, which is better for performance: storing dates as datetimes or as Unix timestamps, which are just long values?

Recommended answer

To add a little more to baloo's answer.

On the timestamp vs. long issue: generally the MongoDB server will not see a difference, since the BSON encoding length is the same (64 bits). You may see a performance difference on the client side depending on the driver's encoding. For example, on the Java side with the 10gen driver, a timestamp is rendered as a Date, which is a lot heavier than a Long. Some drivers try to avoid that overhead.
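As a small illustration of why the two representations are interchangeable in size and ordering: a BSON date is a 64-bit count of milliseconds since the Unix epoch, so a date maps directly onto a long. The sketch below uses plain JavaScript Date objects (the value is a made-up example) and does not touch MongoDB at all.

```javascript
// Hypothetical example instants; any dates would do.
const earlier = new Date("2012-06-01T00:00:00Z");
const later = new Date("2012-06-02T00:00:00Z");

// getTime() returns milliseconds since the Unix epoch, i.e. the "long" form.
const earlierMillis = earlier.getTime();
const laterMillis = later.getTime();

// Converting back loses nothing.
const roundTrip = new Date(earlierMillis);

// Both representations sort the same way, so range queries behave identically.
const sameOrdering = (earlier < later) === (earlierMillis < laterMillis);

console.log(earlierMillis, roundTrip.toISOString(), sameOrdering);
```

Note that a classic Unix timestamp counts seconds, not milliseconds, so an application storing longs has to pick one unit and use it consistently.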

The other issue is that you will see a performance improvement if you close the range on the first field of the index. So if you use the index suggested by baloo:

db.collection.ensureIndex({start: 1, final: 1})

Your query will perform (potentially much) better if it is written as:

db.collection.find({start:{$gt:DateTime(...),$lt:DateTime(...)}, 
                    final:{$lt:DateTime(...)}})

Conceptually, if you think of the index as a tree, the closed range limits both sides of the tree instead of just one. Without the closed range the server has to "check" every entry with a start greater than the timestamp provided, since it does not know the relation between start and final. Adding the upper bound on start does not change the result as long as start never exceeds final in any document and the same end timestamp is used for both fields, because then start <= final < end.
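The pruning effect can be sketched without a server. The snippet below is a toy model only, not how MongoDB is implemented: a sorted array stands in for an index on start, and the function names and data (1000 intervals of length 10) are made up for illustration. It counts how many index entries are examined with and without the upper bound on start.

```javascript
// Toy "index": entries sorted by start. Hypothetical data, 1000 intervals.
const entries = [];
for (let s = 0; s < 1000; s++) {
  entries.push({ start: s, final: s + 10 });
}

// Binary search for the first position whose start is strictly greater than value.
function firstGreaterThan(sorted, value) {
  let lo = 0;
  let hi = sorted.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (sorted[mid].start > value) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

// Walk the index from start > from. If startUpper is given, stop as soon as
// start reaches it (the closed range); otherwise walk to the end of the index.
function scan(sorted, from, to, startUpper) {
  let examined = 0;
  let matches = 0;
  for (let i = firstGreaterThan(sorted, from); i < sorted.length; i++) {
    const e = sorted[i];
    if (startUpper !== undefined && e.start >= startUpper) break;
    examined++;
    if (e.final < to) matches++;
  }
  return { examined, matches };
}

const openRange = scan(entries, 100, 200);        // start > 100, final < 200
const closedRange = scan(entries, 100, 200, 200); // plus start < 200
console.log(openRange, closedRange);
```

With this data both scans return the same 89 matches, but the open-ended scan examines 899 entries while the closed one examines 99. On a real collection the gap grows with the number of documents that start after the selected period.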

You may even find that query performance with the compound index is no better than with a single-field index like:

db.collection.ensureIndex({start: 1})

Most of the savings come from pruning on the first field. The exceptions are when the query is covered by the index, or when the ordering/sort of the results can be derived from the index.

HTH - Rob.
