MongoDB $unwind 和性能 [英] MongoDB $unwind and performance
问题描述
下面是我的数据的一个小例子:
<预><代码>[{"callId": "17dac51e-125e-499e-9064-f20bd3b1a9d8",来电者":{"firstName": "测试","lastName": "测试",电话号码":1231231234"},查询":[{"inquiryId": "b0d14381-ce75-49aa-a66a-c36ae20b72a8",路线历史":[{"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81","routeDate": "2020-01-01T06:00:00.000Z",状态":路由"},{"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81","routeDate": "2020-01-03T06:00:00.000Z",状态":路由"}]},{"inquiryId": "9d743be9-7613-46d7-8f9b-a04b4b899b56",路线历史":[{"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81","routeDate": "2020-01-01T06:00:00.000Z",状态":结束"},{"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81","routeDate": "2020-01-03T06:00:00.000Z",状态":结束"}]}]}]我正在针对更多文档运行以下聚合:
db.collection.aggregate([{$unwind: "$查询"},{$匹配:{"inquiries.routeHistory.status": "已结束"}},{$addFields:{inquiries.routeHistory":{$过滤器:{输入:$inquiries.routeHistory",条件: {$eq: [ { $max: "$inquiries.routeHistory.routeDate" }, "$$this.routeDate" ]}}}}},{$组:{_id: "$_id",callId: { $first: "$callId" },来电者:{ $first:$来电者"},查询:{ $push: "$inquiries" }}}])
虽然这确实返回了预期的结果,但它没有扩展.当我针对更大的数据集运行此程序时,出现超时.我在查询的字段上有索引.有没有办法优化我的查询以获得更好的性能?
重要说明:我仅限于使用 文档数据库
它不能扩展,因为如果 $match
可以应用索引(和其他人,看看 此处) 发生在管道的开头>
确保你有这个多键索引:{'inquiries.routeHistory.status' : 1}
db.collection.aggregate([{$匹配:{"inquiries.routeHistory.status": "已结束"}},{$unwind: "$查询"},{$匹配:{"inquiries.routeHistory.status": "已结束"}},{$addFields:{inquiries.routeHistory":{$过滤器:{输入:$inquiries.routeHistory",条件: {$eq: [{$max: "$inquiries.routeHistory.routeDate"},$$this.routeDate"]}}}}},{$组:{_id: "$_id",电话号码:{$first: "$callId"},来电者:{$first: "$来电者"},查询:{$push: "$查询"}}}])
注意:很遗憾,DocumentDB 不支持 $map
,在这种情况下,我们可以使用 2 $addFields
Below is a small example of my data:
[
{
"callId": "17dac51e-125e-499e-9064-f20bd3b1a9d8",
"caller": {
"firstName": "Test",
"lastName": "Testing",
"phoneNumber": "1231231234"
},
"inquiries": [
{
"inquiryId": "b0d14381-ce75-49aa-a66a-c36ae20b72a8",
"routeHistory": [
{
"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81",
"routeDate": "2020-01-01T06:00:00.000Z",
"status": "routed"
},
{
"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81",
"routeDate": "2020-01-03T06:00:00.000Z",
"status": "routed"
}
]
},
{
"inquiryId": "9d743be9-7613-46d7-8f9b-a04b4b899b56",
"routeHistory": [
{
"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81",
"routeDate": "2020-01-01T06:00:00.000Z",
"status": "ended"
},
{
"assignedUserId": "cfa0ffe9-c77d-4eec-87d7-4430f7772e81",
"routeDate": "2020-01-03T06:00:00.000Z",
"status": "ended"
}
]
}
]
}
]
I am running the following aggregate against many more documents:
db.collection.aggregate([
{
$unwind: "$inquiries"
},
{
$match: {
"inquiries.routeHistory.status": "ended"
}
},
{
$addFields: {
"inquiries.routeHistory": {
$filter: {
input: "$inquiries.routeHistory",
cond: {
$eq: [ { $max: "$inquiries.routeHistory.routeDate" }, "$$this.routeDate" ]
}
}
}
}
},
{
$group: {
_id: "$_id",
callId: { $first: "$callId" },
caller: { $first: "$caller" },
inquiries: { $push: "$inquiries" }
}
}
])
While this does return the expected results, it does not scale. When I run this against a much larger dataset, I'm getting timeouts. I have indexes on the fields I am querying against. Is there a way to optimize my query for better performance?
Important note: I am limited to only using operators supported by DocumentDB
It doesn't scale because indexes can be applied if $match
(and others, take a look here) occurs at the beginning of a pipeline
Make sure you have this multi-key index: {'inquiries.routeHistory.status' : 1}
db.collection.aggregate([
{
$match: {
"inquiries.routeHistory.status": "ended"
}
},
{
$unwind: "$inquiries"
},
{
$match: {
"inquiries.routeHistory.status": "ended"
}
},
{
$addFields: {
"inquiries.routeHistory": {
$filter: {
input: "$inquiries.routeHistory",
cond: {
$eq: [
{
$max: "$inquiries.routeHistory.routeDate"
},
"$$this.routeDate"
]
}
}
}
}
},
{
$group: {
_id: "$_id",
callId: {
$first: "$callId"
},
caller: {
$first: "$caller"
},
inquiries: {
$push: "$inquiries"
}
}
}
])
Note: It's sad, DocumentDB doesn't support $map
, in that case, we could solve it with 2 $addFields
这篇关于MongoDB $unwind 和性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!