MongoDB query too slow when using $or operator

Question

I'm trying to run this query against my collection Audios:

    var querySlow = {
        "palabra": {
            $regex: "^" + keywords,
            "$options": "i"
        },
        $or: [{
            "_p_pais": {
                $in: interested_accents
            }
        }, {
            "languageCodeTatoeba": {
                $in: interested_accents_tatoeba
            }
        }]
    }; // takes 20 seconds

This is really slow, but if I remove either clause of the $or, it is very fast. For example:

    var queryFast1 = {
        "palabra": {
            $regex: "^" + keywords,
            "$options": "i"
        },
        $or: [{
            "_p_pais": {
                $in: interested_accents
            }
        }]
    }; // takes less than 1 second

or this one:

    var queryFast2 = {
        "palabra": {
            $regex: "^" + keywords,
            "$options": "i"
        },
        $or: [{
            "languageCodeTatoeba": {
                $in: interested_accents_tatoeba
            }
        }]
    }; // takes less than 1 second

This is the .explain() output of the slow query:

http://pastebin.com/nrhjB1wf

I don't really know how to manage indexes. Should I create an index for this collection?

Recommended answer

There are some issues with your query and indexes:

1. $or clauses use different indexes

MongoDB only uses one index for a query, with the exception of queries involving an $or clause. From the Indexing Strategies page:

Generally, MongoDB only uses one index to fulfill most queries. However, each clause of an $or query may use a different index

Also, from the $or Clauses and Indexes page:

That is, for MongoDB to use indexes to evaluate an $or expression, all the clauses in the $or expression must be supported by indexes.

With regard to your query, you could try to rearrange the query to make the $or clause a top-level clause:

{$or: [
    {"palabra": {...}, "_p_pais": {...} },
    {"palabra": {...}, "languageCodeTatoeba": {...}}
]}

In this form, MongoDB can use two indexes:

  • Compound index with palabra and _p_pais terms, and
  • Compound index with palabra and languageCodeTatoeba terms
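
As a rough sketch of the rearrangement described above, the helper below builds the top-level $or form of the query together with key patterns for the two compound indexes. The function name `buildTopLevelOrQuery`, the sample input values, and the ascending (1) index directions are assumptions made for illustration; the answer itself does not specify them.

```javascript
// Hypothetical helper: builds the top-level $or form of the slow query.
// Parameter names mirror the variables used in the question.
function buildTopLevelOrQuery(keywords, interestedAccents, interestedAccentsTatoeba) {
    // Same case-insensitive prefix match the question applies to "palabra".
    const palabra = { $regex: "^" + keywords, $options: "i" };
    return {
        $or: [
            // Each branch repeats the "palabra" condition so that it can be
            // answered by its own compound index.
            { palabra: palabra, _p_pais: { $in: interestedAccents } },
            { palabra: palabra, languageCodeTatoeba: { $in: interestedAccentsTatoeba } }
        ]
    };
}

// Key patterns for the two compound indexes, one per $or branch.
// Ascending order (1) is an assumption, not something the answer states.
const indexForPais = { palabra: 1, _p_pais: 1 };
const indexForTatoeba = { palabra: 1, languageCodeTatoeba: 1 };

// Sample values below are made up for the demonstration.
const query = buildTopLevelOrQuery("hol", ["ES", "MX"], ["spa"]);
console.log(JSON.stringify(query, null, 2));

// In the mongo shell these would be applied roughly as:
//   db.Audios.createIndex(indexForPais);
//   db.Audios.createIndex(indexForTatoeba);
//   db.Audios.find(query).explain("executionStats");
```

Whether the planner actually picks these indexes depends on the data, so the explain output remains the thing to check.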

Please use explain("executionStats") to check whether the indexes are used correctly. The key metric to watch is the ratio of documents returned (nReturned) to total docs/keys examined. The closer this ratio is to 1, the more selective your query is, and the better the performance.

For example, if MongoDB has to examine 1000 docs (totalDocsExamined: 1000) but returns only 10 documents (nReturned: 10), then your query is not very selective (i.e. a ratio of 10/1000). An ideal query has a ratio close to or equal to 1, e.g. nReturned: 10, totalDocsExamined: 10, a ratio of 1 (10/10).
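
The ratio in this example can be computed from the executionStats section of the explain output. The snippet below is a minimal sketch: `selectivityRatio` is a hypothetical helper name, and the two objects are hand-written stand-ins for real explain output that carry only the nReturned and totalDocsExamined fields.

```javascript
// Ratio of returned documents to examined documents; closer to 1 is better.
// `stats` is assumed to look like the executionStats section of
// explain("executionStats") output.
function selectivityRatio(stats) {
    if (stats.totalDocsExamined === 0) {
        // No documents were fetched (for instance an empty result),
        // so the ratio is not meaningful.
        return null;
    }
    return stats.nReturned / stats.totalDocsExamined;
}

// The two cases from the text above.
const unselective = { nReturned: 10, totalDocsExamined: 1000 };
const ideal = { nReturned: 10, totalDocsExamined: 10 };

console.log(selectivityRatio(unselective)); // 0.01
console.log(selectivityRatio(ideal));       // 1
```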

2. Too many indexes

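Having too many indexes may lead to:

  • The query planner choosing a sub-optimal index, because the candidates all look alike and it cannot tell which one to use.
  • Relatively slow insert/update performance, because every insert or update to a field contained in an index also requires an insert/update on that index.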

From the explain result you posted, you have at least these indexes in the collection:

_p_pais_-1__p_user_-1__created_at_-1
languageCodeTatoeba_1_lowercase_1
languageCodeTatoeba_1
languageCodeTatoeba_-1
_p_pais_-1
_p_pais_1_languageCodeTatoeba_1
palabra_-1
palabra_1__created_at_-1

There are two issues with this set of indexes:

  1. Some of the indexes are redundant. For example, languageCodeTatoeba_1 (an ascending index) and languageCodeTatoeba_-1 (a descending index) are practically the same index, because MongoDB can traverse a single-field index in either direction. One of them can be removed without any effect on query performance.
  2. Many of the indexes are a prefix of another. For example, palabra_-1 and palabra_1__created_at_. The palabra_-1 index can be removed, since it is a prefix of the palabra_1__created_at_ index. Please see the Compound Index: Prefix page for more details.

From a cursory glance, you may be able to trim your index list to only contain these 4 indexes instead of 8:

_p_pais_-1__p_user_-1__created_at_-1
languageCodeTatoeba_1_lowercase_1
_p_pais_1_languageCodeTatoeba_1
palabra_1__created_at_-1
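
The trimming above can be reproduced mechanically. The sketch below transcribes the eight index names into key lists and drops every index that is a strict prefix of another one. The helper names `isPrefixOf` and `trimRedundant` are hypothetical, and the rule is deliberately simplified: it ignores index options such as unique or sparse, as well as any sort requirements, so treat the result as a list of candidates for removal rather than a verdict.

```javascript
// Each index: its name plus an ordered list of [field, direction] pairs,
// transcribed by hand from the index names listed in the explain output.
const indexes = [
    { name: "_p_pais_-1__p_user_-1__created_at_-1",
      keys: [["_p_pais", -1], ["_p_user", -1], ["_created_at", -1]] },
    { name: "languageCodeTatoeba_1_lowercase_1",
      keys: [["languageCodeTatoeba", 1], ["lowercase", 1]] },
    { name: "languageCodeTatoeba_1", keys: [["languageCodeTatoeba", 1]] },
    { name: "languageCodeTatoeba_-1", keys: [["languageCodeTatoeba", -1]] },
    { name: "_p_pais_-1", keys: [["_p_pais", -1]] },
    { name: "_p_pais_1_languageCodeTatoeba_1",
      keys: [["_p_pais", 1], ["languageCodeTatoeba", 1]] },
    { name: "palabra_-1", keys: [["palabra", -1]] },
    { name: "palabra_1__created_at_-1",
      keys: [["palabra", 1], ["_created_at", -1]] }
];

// True when `a` is a strict prefix of `b`: the same leading fields, with the
// directions either all equal or all reversed (an index can be read backwards).
function isPrefixOf(a, b) {
    if (a.length >= b.length) return false;
    const sameFields = a.every(([field], i) => field === b[i][0]);
    if (!sameFields) return false;
    const allEqual = a.every(([, dir], i) => dir === b[i][1]);
    const allReversed = a.every(([, dir], i) => dir === -b[i][1]);
    return allEqual || allReversed;
}

// Keep only the indexes that are not a prefix of some other index.
function trimRedundant(list) {
    return list.filter(ix => !list.some(other => isPrefixOf(ix.keys, other.keys)));
}

const kept = trimRedundant(indexes).map(ix => ix.name);
console.log(kept.join("\n"));
```

Under these assumptions the four names that remain are the same four listed above.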

3. Why removing one clause from the $or term speeds up the queries

This is because the query

{"palabra": {...}, $or: [{"_p_pais": {...}}]}

is essentially the same as

{"palabra": {...}, "_p_pais": {...}}

Assuming you have a compound index such as palabra_1__p_pais_1, MongoDB would be able to use that index.

Similarly,

{"palabra": {...}, $or: [{"languageCodeTatoeba": {...}}]}

is essentially the same as

{"palabra": {...}, "languageCodeTatoeba": {...}}

This query could use the _p_pais_1_languageCodeTatoeba_1 index, which you already have in your collection.

In short, those two queries are fast because you removed the $or clause, enabling MongoDB to use the correct index.
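
The equivalence used in this section can be written out as a small rewrite step. This is only an illustration of the reasoning, with a hypothetical helper name `flattenSingleOr`; it does not claim to reflect how MongoDB's query planner is implemented.

```javascript
// Rewrites {a: ..., $or: [{b: ...}]} into {a: ..., b: ...}.
// A one-element $or adds no alternatives, so its clause can be merged
// into the surrounding conditions.
function flattenSingleOr(query) {
    const { $or, ...rest } = query;
    if (!Array.isArray($or) || $or.length !== 1) {
        return query; // nothing to flatten
    }
    const clause = $or[0];
    // Leave the query alone if merging would overwrite an existing field.
    for (const key of Object.keys(clause)) {
        if (key in rest) return query;
    }
    return { ...rest, ...clause };
}

// Sample values are made up for the demonstration.
const queryFast1 = {
    palabra: { $regex: "^hol", $options: "i" },
    $or: [{ _p_pais: { $in: ["ES"] } }]
};
console.log(JSON.stringify(flattenSingleOr(queryFast1)));
```

A query with two or more $or clauses is returned unchanged, which corresponds to the slow case in the question.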
