MongoDB regular expression with indexed field


Question

I was creating my first app using MongoDB. I created an index on a field and ran a find query with the $regex operator from the shell:

> db.foo.find({A:{$regex:'BLABLA!25500[0-9]'}}).explain()
{
        "cursor" : "BtreeCursor A_1 multi",
        "nscanned" : 500001,
        "nscannedObjects" : 10,
        "n" : 10,
        "millis" : 956,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {
                "A" : [
                        [
                                "",
                                {

                                }
                        ],
                        [
                                /BLABLA!25500[0-9]/,
                                /BLABLA!25500[0-9]/
                        ]
                ]
        }
}

This is very strange, because when I run the same query with no index on the collection, the performance is much better:

> db.foo.find({A:{$regex:'BLABLA!25500[0-9]'}}).explain()
{
        "cursor" : "BasicCursor",
        "nscanned" : 500002,
        "nscannedObjects" : 500002,
        "n" : 10,
        "millis" : 531,
        "nYields" : 0,
        "nChunkSkips" : 0,
        "isMultiKey" : false,
        "indexOnly" : false,
        "indexBounds" : {

        }
}

Obviously, searching an indexed field without a regex (i.e. matching documents on a constant value) is much faster, but I'm really interested in the reason for this behavior.

Recommended answer

The reason for the performance difference is likely that, with the index in place, the query must traverse the index (loading it into memory) and then also load the matching documents into memory to return them. Because the pattern is not anchored as a prefix (it does not start with ^), every value in the index is scanned and tested against the regular expression, which is not very efficient. The explain output reflects this: nscanned is 500001 even though only 10 documents match.

When you remove the index, you are just doing a table scan and matching the regex against each document as you go. In effect, that is a slightly simpler version of what the indexed query does.

You might be able to make the indexed version quicker if it were a covered index query. The index would also likely help if it were a compound index and you needed to combine the regex with criteria on another field.

When you use a prefix query, the point is not merely that an index is used, but that it is used efficiently: only a narrow range of the index needs to be scanned. That is the key, and it is where you see the real performance gains.
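
As a rough illustration of why the prefix matters, the sketch below models an index as a sorted array in plain JavaScript. It is a toy model, not MongoDB's actual B-tree code: the helper names (buildIndex, scanAll, scanPrefix) and the zero-padded key format are assumptions made up for this example. In the shell, the anchored form of the query from the question would be db.foo.find({A: /^BLABLA!25500[0-9]/}).

```javascript
// Toy model: a single-field index is represented as a sorted array of keys.
// This only illustrates scan cost; it is not how MongoDB stores indexes.

// Build hypothetical keys such as "BLABLA!000000" .. "BLABLA!499999".
// Zero-padding means the generated order is already the sorted order.
function buildIndex(size) {
  const keys = [];
  for (let i = 0; i < size; i++) {
    keys.push("BLABLA!" + String(i).padStart(6, "0"));
  }
  return keys;
}

// Unanchored regex: nothing is known about where a match starts,
// so every key in the index has to be tested.
function scanAll(keys, regex) {
  let examined = 0;
  const matches = [];
  for (const key of keys) {
    examined++;
    if (regex.test(key)) matches.push(key);
  }
  return { examined, matches };
}

// Binary search for the first position whose key is >= target.
function lowerBound(keys, target) {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (keys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Anchored regex with a literal prefix: jump to the first key that could
// match, then walk forward only while keys still share that prefix.
function scanPrefix(keys, literalPrefix, regex) {
  let examined = 0;
  const matches = [];
  for (
    let i = lowerBound(keys, literalPrefix);
    i < keys.length && keys[i].startsWith(literalPrefix);
    i++
  ) {
    examined++;
    if (regex.test(keys[i])) matches.push(keys[i]);
  }
  return { examined, matches };
}

const index = buildIndex(500000);
const unanchored = scanAll(index, /BLABLA!25500[0-9]/);
const anchored = scanPrefix(index, "BLABLA!25500", /^BLABLA!25500[0-9]/);

console.log("unanchored:", unanchored.examined, "keys examined,", unanchored.matches.length, "matches");
console.log("anchored:  ", anchored.examined, "keys examined,", anchored.matches.length, "matches");
```

Both scans return the same 10 keys, but the unanchored one examines every key in the array while the anchored one examines only the keys that share the literal prefix. This mirrors the gap between nscanned and n in the first explain output.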
