How to use $hint in MongoDB aggregation query?

Problem description

I am using MongoDB v3.0.1 on an Ubuntu machine, and I have a collection of 300 million documents. I have created two indexes based on my query patterns.

When I run my aggregation with explain, it picks the less efficient index, which is why the query takes 20-25 seconds longer than it should. Is there any way to apply $hint so that my aggregation query uses the appropriate index?

$match is my first pipeline stage. I have two indexes:

  1. "Host_-1_SiteType_-1"

  2. "VisitTime_-1_AccountId_-1_Host_-1_SiteType_-1_Extension_-1_LifeTime_-1"

and my $match stage looks like this:

{ "$match" : {
    "AccountId": accID,
    "VisitTime": { "$lte" : today, "$gte" : last365Days },
    "$or": [
        { "$and": [
            { "Extension": { "$in": ["chrome_0", "firefox_0"] }},
            { "LifeTime": 0 }
        ]},
        { "LifeTime": { "$gt": 1000 }}
    ],
    "Host": { "$ne": "localhost" },
    "SiteType": { "$exists": true }
}}

It uses the first index instead of the second. With the first index the query takes 50 seconds, whereas with the second index alone it takes only 18 seconds.

Here is one of my sample documents:

{ 
    "_id" : "2bc1143c-07e4-4c37-a020-a7485b2802a3", 
    "CreatedDate" : ISODate("2015-07-22T04:05:06.802+0000"), 
    "UpdatedDate" : ISODate("2015-07-22T05:28:26.469+0000"), 
    "AccountId" : accID, 
    "Url" : "http://www.test.com/test.html", 
    "Host" : "test.com", 
    "VisitTime" : ISODate("2014-08-12T18:08:25.813+0000"), 
    "LifeTime" : 789546.01, 
    "Status" : "closed", 
    "LocalTime" : ISODate("2014-08-12T18:08:25.813+0000"), 
    "DeviceId" : "123456789", 
    "Extension" : "firefox_0", 
    "SubSiteType" : "TestSubSite", 
    "SiteType" : "TestSite", 
    "Flag" : "1"
}

And here is the explain output for my aggregation:

{
    "stages" : [
        {
            "$cursor" : {
                "query" : {
                    "AccountId" : "accID",
                    "VisitTime" : {
                        "$lte" : "2015-07-25T18:30:00Z",
                        "$gte" : "2014-07-25T18:30:00Z"
                    },
                    "Host" : {
                        "$ne" : "localhost"
                    },
                    "SiteType" : {
                        "$exists" : true
                    },
                    "$or" : [
                        {
                            "$and" : [
                                {
                                    "Extension" : {
                                        "$in" : [
                                            "chrome_0",
                                            "firefox_0"
                                        ]
                                    }
                                },
                                {
                                    "LifeTime" : 0
                                }
                            ]
                        },
                        {
                            "LifeTime" : {
                                "$gt" : 1000
                            }
                        }
                    ]
                },
                "fields" : {
                    "Host" : 1,
                    "_id" : 0
                },
                "queryPlanner" : {
                    "plannerVersion" : 1,
                    "namespace" : "Test",
                    "indexFilterSet" : false,
                    "parsedQuery" : {
                        "$and" : [
                            {
                                "$or" : [
                                    {
                                        "$and" : [
                                            {
                                                "LifeTime" : {
                                                    "$eq" : 0
                                                }
                                            },
                                            {
                                                "Extension" : {
                                                    "$in" : [
                                                        "chrome_0",
                                                        "firefox_0"
                                                    ]
                                                }
                                            }
                                        ]
                                    },
                                    {
                                        "LifeTime" : {
                                            "$gt" : 1000
                                        }
                                    }
                                ]
                            },
                            {
                                "$not" : {
                                    "Host" : {
                                        "$eq" : "localhost"
                                    }
                                }
                            },
                            {
                                "VisitTime" : {
                                    "$lte" : "2015-07-25T18:30:00Z"
                                }
                            },
                            {
                                "AccountId" : {
                                    "$eq" : "accID"
                                }
                            },
                            {
                                "VisitTime" : {
                                    "$gte" : "2014-07-25T18:30:00Z"
                                }
                            },
                            {
                                "SiteType" : {
                                    "$exists" : true
                                }
                            }
                        ]
                    },
                    "winningPlan" : {
                        "stage" : "FETCH",
                        "filter" : {
                            "$and" : [
                                {
                                    "SiteType" : {
                                        "$exists" : true
                                    }
                                },
                                {
                                    "$or" : [
                                        {
                                            "$and" : [
                                                {
                                                    "LifeTime" : {
                                                        "$eq" : 0
                                                    }
                                                },
                                                {
                                                    "Extension" : {
                                                        "$in" : [
                                                            "chrome_0",
                                                            "firefox_0"
                                                        ]
                                                    }
                                                }
                                            ]
                                        },
                                        {
                                            "LifeTime" : {
                                                "$gt" : 1000
                                            }
                                        }
                                    ]
                                },
                                {
                                    "VisitTime" : {
                                        "$lte" : "2015-07-25T18:30:00Z"
                                    }
                                },
                                {
                                    "AccountId" : {
                                        "$eq" : "accID"
                                    }
                                },
                                {
                                    "VisitTime" : {
                                        "$gte" : "2014-07-25T18:30:00Z"
                                    }
                                }
                            ]
                        },
                        "inputStage" : {
                            "stage" : "IXSCAN",
                            "keyPattern" : {
                                "Host" : -1,
                                "SiteType" : -1
                            },
                            "indexName" : "Host_-1_SiteType_-1",
                            "isMultiKey" : false,
                            "direction" : "forward",
                            "indexBounds" : {
                                "Host" : [
                                    "[MaxKey, \"localhost\")",
                                    "(\"localhost\", MinKey]"
                                ],
                                "SiteType" : [
                                    "[MaxKey, MinKey]"
                                ]
                            }
                        }
                    },
                    "rejectedPlans" : [
                        {
                            "stage" : "FETCH",
                            "filter" : {
                                "$and" : [
                                    {
                                        "SiteType" : {
                                            "$exists" : true
                                        }
                                    },
                                    {
                                        "$or" : [
                                            {
                                                "$and" : [
                                                    {
                                                        "LifeTime" : {
                                                            "$eq" : 0
                                                        }
                                                    },
                                                    {
                                                        "Extension" : {
                                                            "$in" : [
                                                                "chrome_0",
                                                                "firefox_0"
                                                            ]
                                                        }
                                                    }
                                                ]
                                            },
                                            {
                                                "LifeTime" : {
                                                    "$gt" : 1000
                                                }
                                            }
                                        ]
                                    }
                                ]
                            },
                            "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                    "VisitTime" : -1,
                                    "AccountId" : -1,
                                    "Host" : -1,
                                    "SiteType" : -1,
                                    "Extension" : -1,
                                    "LifeTime" : -1
                                },
                                "indexName" : "VisitTime_-1_AccountId_-1_Host_-1_SiteType_-1_Extension_-1_LifeTime_-1",
                                "isMultiKey" : false,
                                "direction" : "forward",
                                "indexBounds" : {
                                    "VisitTime" : [
                                        "[new Date(1437849000000), new Date(1406313000000)]"
                                    ],
                                    "AccountId" : [
                                        "[\"accID\", \"accID\"]"
                                    ],
                                    "Host" : [
                                        "[MaxKey, \"localhost\")",
                                        "(\"localhost\", MinKey]"
                                    ],
                                    "SiteType" : [
                                        "[MaxKey, MinKey]"
                                    ],
                                    "Extension" : [
                                        "[MaxKey, MinKey]"
                                    ],
                                    "LifeTime" : [
                                        "[MaxKey, MinKey]"
                                    ]
                                }
                            }
                        }
                    ]
                }
            }
        },
        {
            "$group" : {
                "_id" : "$Host",
                "Count" : {
                    "$sum" : {
                        "$const" : 1
                    }
                }
            }
        },
        {
            "$sort" : {
                "sortKey" : {
                    "Count" : -1
                },
                "limit" : 5
            }
        },
        {
            "$project" : {
                "_id" : false,
                "Host" : "$_id",
                "TotalVisit" : "$Count"
            }
        }
    ],
    "ok" : 1
}

Solution

Index design can be very subjective. It is not something where you just idly say "index this stuff" and then hope for the best. It requires some thought about the search process the index is meant to serve.

Your query appears to be driven by a few main elements, mostly the "AccountId" and "LifeTime" values. There are other things in there, notably "VisitTime", but take the old library card index analogy and think about the process.

So when you walk through the library door you are presented with two card index systems:

  1. Contains the books in the library by the date they were authored, allowing you to select the cards pointing to the books based on the date.

  2. Contains the names of the authors of the books and their locations in the library.

Now suppose you want to find the books a particular author has written in the last 10 years. Which index system do you pick? Do you look through 10 years of dates and search for the author within them? Or do you first look up the author, and then narrow down to the books written in the last 10 years?

Chances are that the last 10 years contain a lot more content than just the work of a single author. Therefore 2 is the better choice: once you have all the books for that author, going through the cards to find those from the last 10 years is a much smaller task.

This is why the order of keys in an index matters for the query patterns you are using. Clearly "AccountId" should be the key that narrows the selection the most, followed by the other details that help narrow it down further.

Any index that puts something like "VisitTime" before that means you need to sift through everything in that period you likely don't want before you actually get to the things you need.
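The card index analogy can be made concrete with a toy model. The sketch below is plain JavaScript with no MongoDB involved; the data set (100 accounts, one visit per account per day for 365 days), the "VisitDay" field, and the "scan" helper are all invented for illustration. It deliberately uses a pessimistic simplification: it counts every entry that falls inside the bounds of the leading index key, even though a real query planner can sometimes narrow the scan further using trailing keys.

```javascript
// Toy model of a compound index scan (not MongoDB code).
// Assumption: the work done is roughly the number of entries inside the
// bounds of the LEADING key; the trailing key only filters that set.

function buildDocs() {
  const docs = [];
  // Hypothetical data: 100 accounts, one visit per account per day.
  for (let account = 0; account < 100; account++) {
    for (let day = 0; day < 365; day++) {
      docs.push({ AccountId: "acc" + account, VisitDay: day });
    }
  }
  return docs;
}

// examined: entries inside the leading key's bounds
// returned: entries that also satisfy the trailing key's condition
function scan(docs, leadingMatches, trailingMatches) {
  const examined = docs.filter(leadingMatches);
  const returned = examined.filter(trailingMatches);
  return { examined: examined.length, returned: returned.length };
}

const docs = buildDocs();
const isAccount = (d) => d.AccountId === "acc7"; // equality condition
const inRange = (d) => d.VisitDay >= 65;          // range condition (last 300 days)

// Index ordered (VisitDay, AccountId): the date range leads.
const timeFirst = scan(docs, inRange, isAccount);

// Index ordered (AccountId, VisitDay): the account leads.
const accountFirst = scan(docs, isAccount, inRange);

console.log("time first:   ", timeFirst);
console.log("account first:", accountFirst);
```

Both orderings return the same 300 documents, but under this model the date-first ordering examines 30000 entries while the account-first ordering examines 365.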

Ordering is important, and you need to always consider that with index design.
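As a rough sketch of what this could look like for the query in the question, the code below builds an index key specification with "AccountId" leading, plus the pipeline reconstructed from the explain output. The exact key order after "AccountId", the sort directions, the helper names, and the collection name "visits" in the comments are assumptions for illustration, not something the question or the answer specifies. Note also that the `hint` option on `aggregate` is only available in MongoDB 3.6 and later, so on 3.0.1 reshaping the index is the available route.

```javascript
// Hypothetical index key order: equality field first, range field next.
// The fields after AccountId and all directions are illustrative guesses.
function buildIndexSpec() {
  return { AccountId: 1, VisitTime: -1, Host: 1, SiteType: 1 };
}

// Pipeline reconstructed from the explain output in the question.
function buildPipeline(accID, today, last365Days) {
  return [
    { $match: {
        AccountId: accID,
        VisitTime: { $lte: today, $gte: last365Days },
        $or: [
          { $and: [
              { Extension: { $in: ["chrome_0", "firefox_0"] } },
              { LifeTime: 0 }
          ] },
          { LifeTime: { $gt: 1000 } }
        ],
        Host: { $ne: "localhost" },
        SiteType: { $exists: true }
    } },
    { $group: { _id: "$Host", Count: { $sum: 1 } } },
    { $sort: { Count: -1 } },
    { $limit: 5 },
    { $project: { _id: 0, Host: "$_id", TotalVisit: "$Count" } }
  ];
}

// In the mongo shell these would be used roughly as follows
// ("visits" is a made-up collection name):
//
//   db.visits.createIndex(buildIndexSpec());
//   db.visits.aggregate(buildPipeline(accID, today, last365Days));
//
// On MongoDB 3.6 or later the index can also be forced explicitly:
//
//   db.visits.aggregate(pipeline, { hint: buildIndexSpec() });

const now = new Date();
const yearAgo = new Date(now.getTime() - 365 * 24 * 60 * 60 * 1000);
const pipeline = buildPipeline("accID", now, yearAgo);
console.log(JSON.stringify(buildIndexSpec()));
console.log(pipeline.length + " pipeline stages");
```

Whichever key order is chosen, it is worth running explain again afterwards and checking that the winning plan uses the new index with tight bounds on "AccountId".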
