MongoDB index strategy for range queries on different fields

Question

Almost all of my documents include two fields, a start timestamp and a finish timestamp. In each of my queries I need to get the elements that fall within a selected period of time, so start should be after the selected value and final should be before the selected timestamp.

The query looks like:

db.collection.find({start:{$gt:DateTime(...)}, final:{$lt:DateTime(...)}})

So what is the best indexing strategy for this scenario?

By the way, which is better for performance: storing dates as datetimes, or as Unix timestamps, which are plain long values?

Recommended answer

Adding a bit to baloo's answer.

On the timestamp vs. long question: generally the MongoDB server will not see a difference. The BSON encoding length is the same (64 bits). You may see a performance difference on the client side, depending on the driver's encoding. For example, on the Java side, using the 10gen driver, a timestamp is rendered as a Date, which is a lot heavier than a Long. There are drivers that try to avoid that overhead.
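To make the "same encoding length" point concrete, here is a minimal sketch in plain JavaScript. It assumes only what the BSON format defines: a UTC datetime is stored as a signed 64-bit count of milliseconds since the Unix epoch, and an int64 is stored as a signed 64-bit integer, both little-endian. The helper `encodeInt64LE` and the sample timestamp are hypothetical names and values chosen for illustration; this is not driver code.

```javascript
// Encode a signed 64-bit integer as 8 little-endian bytes, the layout BSON
// uses for the payload of both its UTC datetime and int64 value types.
function encodeInt64LE(value) {
  const buf = new ArrayBuffer(8);
  new DataView(buf).setBigInt64(0, BigInt(value), true); // true = little-endian
  return new Uint8Array(buf);
}

// Hypothetical sample instant: 2012-01-01T00:00:00Z.
const when = new Date(Date.UTC(2012, 0, 1));

// Stored as a BSON date: milliseconds since the epoch.
const asDate = encodeInt64LE(when.getTime());

// Stored as a plain long holding the same millisecond count.
const asLong = encodeInt64LE(1325376000000);

console.log(asDate.length, asLong.length); // both payloads are 8 bytes
```

Only the element's type tag differs between the two representations, which is why any cost difference tends to show up in how the client driver materializes the value rather than on the server.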

The other point is that you will see a performance improvement if you close the range on the first field of the index. So if you use the index suggested by baloo:

db.collection.ensureIndex({start: 1, final: 1})

then a query that closes the range on start looks like:

db.collection.find({start:{$gt:DateTime(...),$lt:DateTime(...)}, 
                    final:{$lt:DateTime(...)}})

Conceptually, if you think of the index as a tree, the closed range limits both sides of the tree instead of just one. Without the closed range, the server has to "check" every entry with a start greater than the timestamp provided, since it does not know about the relation between start and final.
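The effect of closing the range can be sketched with a toy model in plain JavaScript. This is not how MongoDB is implemented: a sorted array stands in for the B-tree behind the {start: 1, final: 1} index, and the data (1000 documents, each lasting 5 time units) and the helper names `buildIndex`, `firstStartAfter` and `scan` are assumptions made up for the illustration.

```javascript
// Hypothetical in-memory stand-in for a compound index on {start: 1, final: 1}.
// Entries are sorted by start, then final, as the index would order its keys.
function buildIndex(docs) {
  return docs
    .map((d) => [d.start, d.final])
    .sort((a, b) => a[0] - b[0] || a[1] - b[1]);
}

// Binary search: first position whose start is strictly greater than `value`.
function firstStartAfter(index, value) {
  let lo = 0, hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] > value) hi = mid; else lo = mid + 1;
  }
  return lo;
}

// Walk entries with start > startAfter (and start < startBefore, if given),
// counting how many are examined and how many also satisfy final < finalBefore.
function scan(index, startAfter, finalBefore, startBefore = Infinity) {
  let examined = 0, matched = 0;
  for (let i = firstStartAfter(index, startAfter); i < index.length; i++) {
    const [start, final] = index[i];
    if (start >= startBefore) break; // the upper bound on start ends the walk
    examined++;
    if (final < finalBefore) matched++;
  }
  return { examined, matched };
}

// Synthetic data: document t starts at t and finishes at t + 5.
const docs = Array.from({ length: 1000 }, (_, t) => ({ start: t, final: t + 5 }));
const index = buildIndex(docs);

const openRange = scan(index, 100, 200);        // start > 100, final < 200
const closedRange = scan(index, 100, 200, 200); // same, plus start < 200
console.log(openRange, closedRange);
```

In this model both scans find the same 94 matches, but the open range walks 899 entries while the closed range walks 99. The extra bound is safe here only because a document's start always precedes its final, which is knowledge the application has and the server does not.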

You may even find that the compound index gives no better query performance than a single-field index like:

db.collection.ensureIndex({start: 1})

Most of the savings come from pruning on the first field. The exceptions are when the query is covered by the index, or when the ordering/sort of the results can be derived from the index.
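A rough way to see why the second field adds little is to count work in a simplified model, again in plain JavaScript rather than against a real server. The data set and the variable names below are assumptions for illustration. The model only assumes that both indexes are ordered by start, and that a single-field index must fetch a document before it can test final.

```javascript
// Synthetic data, assumed for illustration: document t starts at t and
// finishes at t + 5.
const events = Array.from({ length: 1000 }, (_, t) => ({ start: t, final: t + 5 }));
const query = { startAfter: 100, startBefore: 200, finalBefore: 200 };

// Both indexes are ordered by start, so both narrow the scan to the same keys.
const inStartRange = events.filter(
  (e) => e.start > query.startAfter && e.start < query.startBefore
);

// Single-field index {start: 1}: final is not part of the key, so every key in
// the start range costs a document fetch before final can be tested.
const fetchedWithSingle = inStartRange.length;

// Compound index {start: 1, final: 1}: final is tested against the key itself,
// so only entries that pass need a document fetch.
const fetchedWithCompound = inStartRange.filter(
  (e) => e.final < query.finalBefore
).length;

console.log({ keys: inStartRange.length, fetchedWithSingle, fetchedWithCompound });
```

With this data the start bounds cut 1000 candidates down to 99 keys, and the final predicate then removes only 5 more fetches (99 versus 94). How large that second saving is in practice depends on the data, so it is worth measuring on a real collection.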

HTH - Rob.
