Evaluating MongoDB aggregation query complexity: cost of $lookup


Question

I'm evaluating the computational cost of an algorithm that involves some MongoDB aggregation queries, so I'm trying to figure out the cost of each operator I use. Since the stages are applied in cascade, the cost of the whole query should then be just the sum of the costs of its stages.

I concluded that the cost of $project, $match and $unwind is O(n), where n is the number of documents in the collection: I don't have any indexes, so every document has to be scanned.

Now my question is: what about the cost of the new $lookup operator? It performs a left join over two collections, so my first guess is that it more or less computes the Cartesian product of the two collections, and hence the cost should be something like O(n * m), where m is the size of the second collection. Am I right? Does MongoDB do something more efficient? Do you have any reference on this topic?

Recommended answer

$lookup is effectively an $in query against the referenced collection, where the value of $in is the set of localField values coming from the pipeline into the lookup.
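
To make that equivalence concrete, here is a minimal Python sketch that takes a $lookup stage and derives the $in filter described above. The collection and field names (orders, customers, customer_id) and the helper equivalent_in_filter are hypothetical and chosen only for illustration. The stage is a plain dict; nothing here talks to a real server or claims to mirror MongoDB's internal implementation.

```python
# Hypothetical $lookup stage: join "orders" documents to "customers" on customer_id.
lookup_stage = {
    "$lookup": {
        "from": "customers",
        "localField": "customer_id",
        "foreignField": "_id",
        "as": "customer",
    }
}


def equivalent_in_filter(stage, pipeline_docs):
    """Build the $in filter that the answer says $lookup effectively runs.

    Illustrative helper only, not part of any MongoDB driver.
    """
    spec = stage["$lookup"]
    # Collect the distinct localField values flowing through the pipeline.
    values = []
    for doc in pipeline_docs:
        value = doc.get(spec["localField"])
        if value not in values:
            values.append(value)
    # The filter targets foreignField in the referenced ("from") collection.
    return {spec["foreignField"]: {"$in": values}}


# Made-up documents standing in for the pipeline's input.
orders = [
    {"_id": 1, "customer_id": "a"},
    {"_id": 2, "customer_id": "b"},
    {"_id": 3, "customer_id": "a"},
]

in_filter = equivalent_in_filter(lookup_stage, orders)
print(in_filter)  # {'_id': {'$in': ['a', 'b']}}
```

The resulting filter is what one would pass to a find on the customers collection, so the cost of the join reduces to the cost of evaluating that filter.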

If the foreignField is indexed, that query's complexity is O(log(n)). If the foreignField isn't indexed, the query's complexity is O(n). (Since the query runs against the referenced collection, n here presumably denotes the size of that collection, i.e. m in the question's notation.)
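
The gap between the two cases can be illustrated with a small pure-Python simulation that counts work instead of measuring time. It is a sketch under stated assumptions: a sorted list with binary search stands in for an index on the foreign field, a nested loop stands in for an unindexed collection scan, and the function and field names (lookup_without_index, lookup_with_index, ref) are made up for the example. The cost of building the stand-in index is deliberately not counted.

```python
def lookup_without_index(local_docs, foreign_docs, local_field, foreign_field):
    """Join by scanning the whole foreign collection for every local document."""
    comparisons = 0
    joined = []
    for doc in local_docs:
        matches = []
        for candidate in foreign_docs:
            comparisons += 1
            if candidate[foreign_field] == doc[local_field]:
                matches.append(candidate)
        joined.append({**doc, "matches": matches})
    return joined, comparisons


def lookup_with_index(local_docs, foreign_docs, local_field, foreign_field):
    """Join using a sorted list as a stand-in for an index on foreign_field."""
    # Assumption: index construction is a one-time cost and is not counted here.
    index = sorted(foreign_docs, key=lambda d: d[foreign_field])
    keys = [d[foreign_field] for d in index]
    probes = 0
    joined = []
    for doc in local_docs:
        target = doc[local_field]
        # Binary search for the first key that is >= target.
        lo, hi = 0, len(keys)
        while lo < hi:
            mid = (lo + hi) // 2
            probes += 1
            if keys[mid] < target:
                lo = mid + 1
            else:
                hi = mid
        # Collect every entry whose key equals the target.
        matches = []
        while lo < len(keys) and keys[lo] == target:
            matches.append(index[lo])
            lo += 1
        joined.append({**doc, "matches": matches})
    return joined, probes


# Made-up data: 1024 foreign documents, 5 local documents (one with no match).
foreign = [{"_id": i} for i in range(1024)]
local = [{"ref": r} for r in (0, 5, 512, 1023, 2000)]

scan_result, scan_cost = lookup_without_index(local, foreign, "ref", "_id")
index_result, index_cost = lookup_with_index(local, foreign, "ref", "_id")
print(scan_cost, index_cost)
```

With 5 local documents and 1024 foreign documents, the scan performs 5 * 1024 comparisons, while the index-backed version needs at most 11 probes per looked-up value. Both produce the same joined output, which is the left-join behaviour the question describes: the local document with no match is kept with an empty list.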
