查找聚合性能不佳 [英] Poor lookup aggregation performance

查看：61 发布时间：2020/5/10 22:30:29 mongodb aggregation-framework

本文介绍了查找聚合性能不佳的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我有两个收藏夹

帖子:

{
    "_Id": "1",
    "_PostTypeId": "1",
    "_AcceptedAnswerId": "192",
    "_CreationDate": "2012-02-08T20:02:48.790",
    "_Score": "10",
    ...
    "_OwnerUserId": "6",
    ...
},
...

和用户:

{
    "_Id": "1",
    "_Reputation": "101",
    "_CreationDate": "2012-02-08T19:45:13.447",
    "_DisplayName": "Geoff Dalgas",
    ...
    "_AccountId": "2"
},
...

，我想找到撰写5到15个帖子的用户. 这是我的查询的样子:

and I want to find users who write between 5 and 15 posts. This is how my query looks like:

db.posts.aggregate([
    {
        $lookup: {
            from: "users", 
            localField: "_OwnerUserId",
            foreignField: "_AccountId", 
            as: "X"
        }
    },  
    {
        $group: {
            _id: "$X._AccountId", 
            posts: { $sum: 1 }
        }
    },   
    {
        $match : {posts: {$gte: 5, $lte: 15}}
    },  
    {
        $sort: {posts: -1 }
    },
    {
        $project : {posts: 1}
    }
])

，它的运行速度很慢.对于6k用户和10k帖子，花了40秒钟以上才能获得响应，而在关系数据库中，我很快就获得了响应. 哪里出问题了?我刚开始使用mongodb，很可能我弄乱了这个查询.

and it works terrible slow. For 6k users and 10k posts it tooks over 40 seconds to get response while in relational database I get response in a split second. Where's the problem? I'm just getting started with mongodb and it's quite possible that I messed up this query.

foreignField在from中的文档中指定字段收藏. $ lookup在foreignField上执行平等匹配，以输入文档中的localField.如果文档来自集合不包含foreignField，$ lookup对待值作为null进行匹配.

foreignField Specifies the field from the documents in the from collection. $lookup performs an equality match on the foreignField to the localField from the input documents. If a document in the from collection does not contain the foreignField, the $lookup treats the value as null for matching purposes.

这将与其他任何查询一样执行.

This will be performed the same as any other query.

如果您在_AccountId字段上没有索引，它将对10,000个帖子中的每一个进行全表扫描查询.大部分时间将花费在该表扫描中.

If you don't have an index on the field _AccountId, it will do a full tablescan query for each one of the 10,000 posts. The bulk of the time will be spent in that tablescan.

db.users.ensureIndex("_AccountId", 1)

加快了处理速度，因此它可以完成10,000次索引命中，而不是10,000次表扫描.

speeds up the process so it's doing 10,000 index hits instead of 10,000 table scans.

这篇关于查找聚合性能不佳的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

查找聚合性能不佳 [英] Poor lookup aggregation performance

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

查找聚合性能不佳 [英] Poor lookup aggregation performance

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭