MongoDB Many Indexes vs. Single Index on array of Sub-Documents?


Question

Wondering which would be the more efficient technique for indexing my document's various timestamps that I need to keep track of, keeping in mind my application is fairly heavy on writing, but heavy enough on reading that without the indexes, the queries are too slow.

Is it better to have a field for each timestamp, and index each field, or store the timestamps and their associated type in an array field, and index each field of that array?

First option, separate fields, and an index for each:

{
    "_id" : "...",
    "Field1.Timestamp" : '2011-01-01 01:00.000',
    "Field2.Timestamp" : '2011-01-01 01:00.000',
    "Field3.Timestamp" : '2011-01-01 01:00.000',
    "Field4.Timestamp" : '2011-01-01 01:00.000',
    "Field5.Timestamp" : '2011-01-01 01:00.000',
    "Field6.Timestamp" : '2011-01-01 01:00.000',
    "Field7.Timestamp" : '2011-01-01 01:00.000',
    "Field8.Timestamp" : '2011-01-01 01:00.000',
    "Field9.Timestamp" : '2011-01-01 01:00.000',
}

db.mycollection.ensureIndex({ "Field1.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field2.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field3.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field4.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field5.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field6.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field7.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field8.Timestamp" : 1 });
db.mycollection.ensureIndex({ "Field9.Timestamp" : 1 });

Then there's an array of the timestamps and their types, with only a single index:

{
    "_id" : "...",
    "Timestamps" : [
        { "Type" : "Field1", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field2", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field3", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field4", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field5", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field6", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field7", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field8", "Timestamp" : '2011-01-01  01:00.000' },
        { "Type" : "Field9", "Timestamp" : '2011-01-01  01:00.000' },
    ]
}

db.mycollection.ensureIndex({ "Timestamps.Type" : 1, "Timestamps.Timestamp" : 1 });

Am I way off the mark here? Or which would be the better way?

Recommended answer

This basically boils down to whether 10 indexes of size N are more efficient than one index of size N * 10. If you look purely at reads, then the separate indexes should always be faster: the associated b-tree walks examine a smaller keyset.
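To put rough numbers on the "smaller keyset", here is a back-of-the-envelope sketch in plain JavaScript. It assumes an idealized B-tree with a fixed fan-out. The collection size, the fan-out of 100, and the helper name `btreeLevels` are illustrative assumptions, not MongoDB internals.

```javascript
// Idealized estimate: levels a lookup visits in a B-tree holding `keys` entries
// when every node has `fanout` children. Real MongoDB indexes differ in detail;
// the fan-out used below is an illustrative assumption.
function btreeLevels(keys, fanout) {
  let levels = 1;
  let capacity = fanout;
  while (capacity < keys) {
    capacity *= fanout;
    levels += 1;
  }
  return levels;
}

const docs = 1000000;         // hypothetical collection size (N)
const timestampsPerDoc = 10;  // matches the "10 indexes" framing above
const fanout = 100;           // assumed keys per B-tree node

// One of the ten small per-field indexes holds N keys.
const separate = btreeLevels(docs, fanout);
// The single array index holds N * 10 keys.
const combined = btreeLevels(docs * timestampsPerDoc, fanout);

console.log({ separate, combined, extraLevels: combined - separate });
```

Under these assumed numbers the combined index is one level deeper than each separate index. A tenfold increase in keys adds levels only logarithmically, so the read-side gap stays small.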

But there are a few things to consider:


  • Indexes on array fields basically index each array element separately. As such, the lookup overhead will be at most 1-2 additional steps during the b-tree walk, which is a negligible performance hit. In other words, they'll be almost as fast.
  • Having 10 indexes may mean each update/insert requires more than one index to be updated (depending on whether your indexes share a field or whether you update more than one timestamp at a time). This is a significant performance consideration.
  • Using an array index makes it a bit easier to add additional timestamps (e.g. Timestamp10).
  • There is a limit to the number of namespaces you can use per database (about 24k by default; this limit applies to the MMAPv1 storage engine) and each index takes up one. If you make a separate index per field this might become an issue.
  • Most importantly, the array index is far more straightforward and will simplify your code and thus its maintainability. Given the limited performance differences, I'd say this is the strongest motivation to go for an array index here.
