MongoDB Multikey Compound Index - Need Help Understanding Bounds

Question

We've recently decided to revisit some of our MongoDB indexes and came across a peculiar result when using a compound index which contains a multikey part.

It's important to note that we're using v2.4.5

TLDR: When using a compound index with multikey part, the bounds of a non-multikey field used for range restriction are dropped.

I'll explain the problem with an example:

Create some data

db.demo.insert(
[{ "foo" : 1, "attr" : [  {  "name" : "a" },  {  "name" : "b" },  {  "name" : "c" } ]},
 { "foo" : 2, "attr" : [  {  "name" : "b" },  {  "name" : "c" },  {  "name" : "d" } ]},
 { "foo" : 3, "attr" : [  {  "name" : "c" },  {  "name" : "d" },  {  "name" : "e" } ]},
 { "foo" : 4, "attr" : [  {  "name" : "d" },  {  "name" : "e" },  {  "name" : "f" } ]}])

Index

db.demo.ensureIndex({'attr.name': 1, 'foo': 1})

Query & Explain

Query on 'attr.name' but constrain the range of the non-multikey field 'foo':

db.demo.find({foo: {$lt:3, $gt: 1}, 'attr.name': 'c'}).hint('attr.name_1_foo_1').explain()
{
    "cursor" : "BtreeCursor attr.name_1_foo_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 2,
    "nscanned" : 2,
    "nscannedObjectsAllPlans" : 2,
    "nscannedAllPlans" : 2,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "attr.name" : [
            [
                "c",
                "c"
            ]
        ],
        "foo" : [
            [
                -1.7976931348623157e+308,
                3
            ]
        ]
    }
}

As you can see, the range of 'foo' is not as defined in the query: one end is completely ignored, which results in nscanned being larger than it should be.

Changing the order of the range operands will alter the dropped end:

db.demo.find({foo: {$gt: 1, $lt:3}, 'attr.name': 'c'}).hint('attr.name_1_foo_1').explain()
{
    "cursor" : "BtreeCursor attr.name_1_foo_1",
    "isMultiKey" : true,
    "n" : 1,
    "nscannedObjects" : 2,
    "nscanned" : 2,
    "nscannedObjectsAllPlans" : 2,
    "nscannedAllPlans" : 2,
    "scanAndOrder" : false,
    "indexOnly" : false,
    "nYields" : 0,
    "nChunkSkips" : 0,
    "millis" : 0,
    "indexBounds" : {
        "attr.name" : [
            [
                "c",
                "c"
            ]
        ],
        "foo" : [
            [
                1,
                1.7976931348623157e+308
            ]
        ]
    }
}
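To make the cost of the dropped bound concrete, here is a minimal sketch in plain JavaScript (not the mongo shell, and not MongoDB's actual scan logic). It builds one simulated index key per array element for the demo data and counts how many keys fall inside each set of bounds. The names `indexKeys` and `simulate` are hypothetical helpers introduced for illustration only.

```javascript
// Demo data from the question.
const docs = [
  { foo: 1, attr: [{ name: "a" }, { name: "b" }, { name: "c" }] },
  { foo: 2, attr: [{ name: "b" }, { name: "c" }, { name: "d" }] },
  { foo: 3, attr: [{ name: "c" }, { name: "d" }, { name: "e" }] },
  { foo: 4, attr: [{ name: "d" }, { name: "e" }, { name: "f" }] },
];

// Simulated multikey index on ('attr.name', 'foo'): one key per array element.
const indexKeys = docs.flatMap((d, i) =>
  d.attr.map((a) => ({ name: a.name, foo: d.foo, docIndex: i }))
);

// Hypothetical helper: count the keys inside the given 'foo' bounds (nscanned)
// and the documents that then pass the full query filter (n).
function simulate(fooInBounds) {
  const scanned = indexKeys.filter(
    (k) => k.name === "c" && fooInBounds(k.foo)
  );
  const matched = scanned.filter((k) => {
    const foo = docs[k.docIndex].foo;
    return foo > 1 && foo < 3;
  });
  return { nscanned: scanned.length, n: matched.length };
}

const upperOnly = simulate((foo) => foo < 3); // only $lt kept as a bound
const lowerOnly = simulate((foo) => foo > 1); // only $gt kept as a bound
const bothBounds = simulate((foo) => foo > 1 && foo < 3); // what we expected

console.log({ upperOnly, lowerOnly, bothBounds });
```

Under these assumptions, either single-sided bound examines two keys to return one document, which is consistent with the nscanned of 2 in both explain outputs, while the two-sided bound would examine only one.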

We're either missing out on some multikey index basics, or we're facing a bug.

We've gone through similar topics, including:

  • https://groups.google.com/forum/#!searchin/mongodb-user/multikey$20bounds/mongodb-user/RKrsyzRwHrE/_i0SxdJV5qcJ
  • Order of $lt and $gt in MongoDB range query

Unfortunately these posts address a different use-case where a range is set on the multikeyed value.

Other things we've tried to do:

  • Change the compound index ordering, starting with the non-multikey field.

  • Put the 'foo' value inside each of the subdocuments in the 'attr' array, index by ('attr.name', 'attr.foo') and do an $elemMatch on 'attr' with a range constraint on 'foo'.

  • Use an $and operator when defining the range:

db.demo.find({'attr.name': 'c', $and: [{foo: {$lt: 3}}, {foo: {$gt: 1}}]})

  • Use MongoDB v2.5.4

None of the above had any effect (v2.5.4 made things worse by dumping both ends of the range completely).

Any help would be much appreciated!

Many thanks,

Roi

Recommended answer

With compound indexes where one of the indexed fields is an array, MongoDB will only use either a lower or upper bound for the range query to ensure correct matches are returned. See SERVER-958 for an example where constraining to both upper and lower index bounds would not find the expected document.
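As an illustration of why intersecting both bounds can lose results when the range is on an array field, here is a minimal sketch in plain JavaScript. It is a simulation under stated assumptions, not MongoDB's implementation: the document, the field name `a`, and the helpers `matchesQuery` and `scan` are all hypothetical. It relies on the documented matching rule that, without $elemMatch, different array elements may satisfy different range predicates.

```javascript
// One document whose indexed field is an array.
const doc = { _id: 1, a: [1, 10] };

// Semantics of { a: { $gt: 3, $lt: 7 } } without $elemMatch:
// each predicate may be satisfied by a different array element.
function matchesQuery(d) {
  return d.a.some((v) => v > 3) && d.a.some((v) => v < 7);
}

// Simulated multikey index: one key per array element.
const indexKeys = doc.a.map((v) => ({ key: v, id: doc._id }));

// Hypothetical helper: return the distinct document ids whose keys are in bounds.
function scan(keys, inBounds) {
  return [...new Set(keys.filter((k) => inBounds(k.key)).map((k) => k.id))];
}

// Intersected bounds (3, 7): neither key 1 nor key 10 is inside them.
const intersected = scan(indexKeys, (k) => k > 3 && k < 7);

// Lower bound only: key 10 is found, and the full filter is applied afterwards.
const lowerOnly = scan(indexKeys, (k) => k > 3).filter(() => matchesQuery(doc));

console.log({ matches: matchesQuery(doc), intersected, lowerOnly });
```

In this sketch the document matches the query, yet a scan restricted to both bounds finds no index key for it, whereas a scan using one bound followed by the filter does find it.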

If your range query is on the array field you can potentially use the $elemMatch operator to optimise your query within the expected index bounds. As of MongoDB 2.4, the $elemMatch operator does not work on non-array fields, so unfortunately this doesn't help your use case. You can watch/upvote SERVER-6050: Consider allowing $elemMatch applied to non arrays in the MongoDB issue tracker.
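The reason $elemMatch allows tighter bounds can be sketched in plain JavaScript. This is an illustrative simulation, not MongoDB's code; the sample documents, the field name `a`, and the helper `elemMatchRange` are assumptions made for the example. With $elemMatch a single array element must satisfy both predicates, so every matching document has at least one index key inside the intersected range.

```javascript
// Hypothetical sample documents with an array field 'a'.
const docs = [
  { _id: 1, a: [1, 10] },
  { _id: 2, a: [5] },
  { _id: 3, a: [2, 6] },
];

// Semantics of { a: { $elemMatch: { $gt: 3, $lt: 7 } } }:
// one and the same element must satisfy both predicates.
function elemMatchRange(d) {
  return d.a.some((v) => v > 3 && v < 7);
}

// Simulated multikey index: one key per array element.
const indexKeys = docs.flatMap((d) => d.a.map((v) => ({ key: v, id: d._id })));

// Scan using both bounds at once, then deduplicate the document ids.
const idsFromIntersectedBounds = [
  ...new Set(indexKeys.filter((k) => k.key > 3 && k.key < 7).map((k) => k.id)),
].sort((x, y) => x - y);

// Reference result: apply the $elemMatch semantics to every document.
const idsFromFullScan = docs.filter(elemMatchRange).map((d) => d._id);

console.log({ idsFromIntersectedBounds, idsFromFullScan });
```

Both approaches select the same documents here, which is why intersecting the bounds is safe under $elemMatch semantics but not for the plain range query.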

There is also an open issue SERVER-7959: Potentially unexpected scans with compound indexes when some fields are multikey describing this behaviour.
