Need guidance on mongo aggregate lookup to count subnodes in all child nodes for each node

Question

To better understand my problem, I've created a fake scenario of counting leaves in forests at all levels.

I need to provide total counts of leaves for all forests, for each forest, for each tree, and for each branch. Thus, I need to output a tree structure that has all this info along with the IDs of each element so they can be identified.

Input data comes from 2 collections - forests and leaves. I need to join the forests and leaves on branch_id, and output leaf counts and inject these counts into a structure equivalent to the forests collection (as a query result - not stored in the database). Memory usage is a concern. I was thinking I could do a join and read all these ids into memory, but there may be up to 100 forests. Each forest likely has up to 10 trees, each tree has about 25 branches, each branch has up to 15 leaves.
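
For a sense of scale, the upper bounds quoted above can be multiplied out. This is only rough arithmetic: the constants are the figures from the paragraph, and the variable names are made up for illustration.

```javascript
// Upper-bound sizes taken from the figures quoted in the question.
const maxForests = 100;
const treesPerForest = 10;
const branchesPerTree = 25;
const leavesPerBranch = 15;

// Multiply down the hierarchy to get worst-case totals.
const totalTrees = maxForests * treesPerForest;
const totalBranches = totalTrees * branchesPerTree;
const totalLeaves = totalBranches * leavesPerBranch;

console.log({ totalTrees, totalBranches, totalLeaves });
```

In the worst case that is about 25,000 branch IDs to join on and about 375,000 leaf documents to count.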

Forests collection
[
    {
        forest_id: 'forestA',
        trees: [
            {
                tree_id: 'treeA',
                branches: [
                    {
                         branch_id: 'branchA',
                    }
                ]
            }
        ]
    }
]

Leaves collection
[
    {
        leaf_id: 'leafA',
        branch_id: 'branchA'
    }
]

This is the desired output:

{
    leaf_count: 9999999999,
    forests: [
       {
            leaf_count: 8888888,
            forest_id: 'forestA',
            trees: [
                {
                    leaf_count: 77777,
                    tree_id: 'treeA',
                    branches: [
                        {
                            leaf_count: 6666,
                            branch_id: 'branchA'
                        }
                    ]
                }
            ]
        }
    ]
}

The aggregate pipeline I'm working on already produces all of the desired output (built up to this point) except the counts. I think I need to use facets here, but I'm worried about performance, and I'd like to know what I should learn to tackle this properly. Any guidance would be much appreciated. Thanks!

Recommended answer

Get the test data from this Mongo Playground link.

NOTE: You can skip all $sort stages in the above link for the sake of performance.

Query to count the leaves:

db.Forests.aggregate([
    // Flatten to one document per branch.
    { $unwind: "$trees" },
    { $unwind: "$trees.branches" },
    // Join each branch to its leaves on branch_id.
    {
        $lookup: {
            from: "Leaves",
            localField: "trees.branches.branch_id",
            foreignField: "branch_id",
            as: "trees.branches.leaves"
        }
    },
    // Keep only the number of joined leaves, then drop the leaf documents.
    {
        $addFields: {
            "trees.branches.leaf_count": { $size: "$trees.branches.leaves" }
        }
    },
    {
        $project: { "trees.branches.leaves": 0 }
    },
    // Rebuild each tree: sum the branch counts and collect the branches.
    {
        $group: {
            _id: {
                forest_id: "$forest_id",
                tree_id: "$trees.tree_id"
            },
            leaf_count: { $sum: "$trees.branches.leaf_count" },
            branches: { $push: "$trees.branches" }
        }
    },
    // Rebuild each forest: sum the tree counts and collect the trees.
    {
        $group: {
            _id: "$_id.forest_id",
            leaf_count: { $sum: "$leaf_count" },
            trees: {
                $push: {
                    leaf_count: "$leaf_count",
                    tree_id: "$_id.tree_id",
                    branches: "$branches"
                }
            }
        }
    },
    // Collapse everything into one document with the grand total.
    {
        $group: {
            _id: null,
            leaf_count: { $sum: "$leaf_count" },
            forests: {
                $push: {
                    leaf_count: "$leaf_count",
                    forest_id: "$_id",
                    trees: "$trees"
                }
            }
        }
    },
    {
        $project: { _id: 0 }
    }
])

Output:

{
    "leaf_count" : 4,
    "forests" : [
        {
            "leaf_count" : 3,
            "forest_id" : "forestA",
            "trees" : [
                {
                    "leaf_count" : 2,
                    "tree_id" : "treeA",
                    "branches" : [
                        {
                            "branch_id" : "branchA",
                            "leaf_count" : 1
                        },
                        {
                            "branch_id" : "branchA1",
                            "leaf_count" : 1
                        },
                        {
                            "branch_id" : "branchA2",
                            "leaf_count" : 0
                        }
                    ]
                },
                {
                    "leaf_count" : 1,
                    "tree_id" : "treeB",
                    "branches" : [
                        {
                            "branch_id" : "branchB",
                            "leaf_count" : 1
                        }
                    ]
                }
            ]
        },
        {
            "leaf_count" : 1,
            "forest_id" : "forestB",
            "trees" : [
                {
                    "leaf_count" : 1,
                    "tree_id" : "treeC",
                    "branches" : [
                        {
                            "branch_id" : "branchC",
                            "leaf_count" : 1
                        }
                    ]
                },
                {
                    "leaf_count" : 0,
                    "tree_id" : "treeD",
                    "branches" : [
                        {
                            "branch_id" : "branchD",
                            "leaf_count" : 0
                        }
                    ]
                }
            ]
        },
        {
            "leaf_count" : 0,
            "forest_id" : "forestC",
            "trees" : [
                {
                    "leaf_count" : 0,
                    "tree_id" : "treeE",
                    "branches" : [
                        {
                            "branch_id" : "branchE",
                            "leaf_count" : 0
                        }
                    ]
                }
            ]
        }
    ]
}
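
To sanity-check the pipeline on a small sample, the same roll-up can be sketched in plain JavaScript outside the database. This is a minimal sketch and not part of the original answer. The sample data is hypothetical (it is not the Playground test data), and `countLeavesByBranch` and `rollUpLeafCounts` are illustrative helper names, not MongoDB APIs.

```javascript
// Hypothetical sample data shaped like the Forests and Leaves collections.
const sampleForests = [
  {
    forest_id: "forestA",
    trees: [
      { tree_id: "treeA", branches: [{ branch_id: "branchA" }, { branch_id: "branchA1" }] },
      { tree_id: "treeB", branches: [{ branch_id: "branchB" }] },
    ],
  },
  {
    forest_id: "forestB",
    trees: [{ tree_id: "treeC", branches: [{ branch_id: "branchC" }] }],
  },
];
const sampleLeaves = [
  { leaf_id: "leafA", branch_id: "branchA" },
  { leaf_id: "leafB", branch_id: "branchA" },
  { leaf_id: "leafC", branch_id: "branchB" },
];

// Count leaves per branch in a single pass over the leaves.
function countLeavesByBranch(leaves) {
  const counts = new Map();
  for (const { branch_id } of leaves) {
    counts.set(branch_id, (counts.get(branch_id) || 0) + 1);
  }
  return counts;
}

// Roll the per-branch counts up to trees, forests, and the overall total.
function rollUpLeafCounts(forests, leaves) {
  const counts = countLeavesByBranch(leaves);
  const sum = (items) => items.reduce((total, item) => total + item.leaf_count, 0);

  const forestsOut = forests.map((forest) => {
    const trees = forest.trees.map((tree) => {
      const branches = tree.branches.map((branch) => ({
        leaf_count: counts.get(branch.branch_id) || 0, // branches without leaves count as 0
        branch_id: branch.branch_id,
      }));
      return { leaf_count: sum(branches), tree_id: tree.tree_id, branches };
    });
    return { leaf_count: sum(trees), forest_id: forest.forest_id, trees };
  });

  return { leaf_count: sum(forestsOut), forests: forestsOut };
}

const result = rollUpLeafCounts(sampleForests, sampleLeaves);
console.log(JSON.stringify(result, null, 2));
```

When comparing this against the pipeline's result on the same data, match forests and trees by ID rather than by position, because `$group` does not guarantee the order of its output documents.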
