mongodb.countDocuments is slow when result set is large even if index is used

Question

mongodb.countDocuments is slow when the result set is large.

Test data on the users collection:

  • 10 million documents with status 'active'
  • 100k documents with status 'inactive'

The status field is indexed: {status: 1}

  • db.users.countDocuments({status: 'active'}) takes 2.91 sec
  • db.users.countDocuments({status: 'inactive'}) takes 0.018 sec

I understand that countDocuments uses an aggregation to find and count the results.

estimatedDocumentCount() does not work in this case because a query filter is needed.

Any suggestions for improvement?

Recommended answer

Counting seems like one of those things that should be cheap, but often isn't. Because MongoDB doesn't maintain a count of the documents that match a given criterion in its B-tree index, it has to scan through the index, counting entries as it goes. That means counting 100x the documents takes roughly 100x the time, which is about what we see here: 0.018 s * 100 = 1.8 s, the same order of magnitude as the measured 2.91 s.
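
The arithmetic behind that estimate can be worked through from the two timings in the question. This is only a back-of-the-envelope sketch: it assumes the cost is purely linear in the number of index entries scanned, and the variable names are illustrative.

```javascript
// Timings and document counts reported in the question.
const inactiveDocs = 100000;
const inactiveSeconds = 0.018;
const activeDocs = 10000000;
const activeSeconds = 2.91;

// Assumption: scan time grows linearly with the number of matching index entries.
const secondsPerEntry = inactiveSeconds / inactiveDocs;
const predictedActiveSeconds = secondsPerEntry * activeDocs;

console.log(`per entry: ${(secondsPerEntry * 1e9).toFixed(0)} ns`);
console.log(
  `predicted: ${predictedActiveSeconds.toFixed(2)} s, measured: ${activeSeconds} s`
);
```

The measured time is somewhat higher than the linear prediction, but close enough to suggest the count is dominated by the index scan rather than by some fixed overhead.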

To speed this up, you have a few options:

  1. The active count is roughly estimatedDocumentCount() - db.users.countDocuments({status: 'inactive'}), assuming every document is either active or inactive. Would this be accurate enough for your use case?
  2. Alternatively, you can maintain a counts document in a separate collection and keep it in sync with the number of active/inactive documents you have.
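
Both options could look roughly like the following. This is a minimal sketch, not a definitive implementation: it assumes collection handles that expose estimatedDocumentCount(), countDocuments(), updateOne() and findOne() the way the Node.js driver's Collection does. The function names, the `counts` collection and the 'users_by_status' _id are hypothetical. It also does not make the counter update atomic with the write to users, so the counters can drift unless both writes are wrapped in a transaction or reconciled periodically.

```javascript
// Option 1: derive the active count from two cheap calls.
// Assumes every document has status 'active' or 'inactive'.
async function approximateActiveCount(users) {
  const [total, inactive] = await Promise.all([
    users.estimatedDocumentCount(),               // uses collection metadata, no scan
    users.countDocuments({ status: 'inactive' }), // scans only the ~100k inactive entries
  ]);
  // The estimate can lag behind, so clamp at zero rather than return a negative count.
  return Math.max(0, total - inactive);
}

// Option 2: keep per-status counters in one document of a separate collection.
const COUNTER_ID = 'users_by_status'; // hypothetical _id for the counter document

// Call after inserting a user with the given status.
async function recordInsert(counts, status) {
  await counts.updateOne(
    { _id: COUNTER_ID },
    { $inc: { [status]: 1 } },
    { upsert: true } // creates the counter document on first use
  );
}

// Call after changing a user's status from `from` to `to`.
async function recordStatusChange(counts, from, to) {
  if (from === to) return;
  await counts.updateOne(
    { _id: COUNTER_ID },
    { $inc: { [from]: -1, [to]: 1 } },
    { upsert: true }
  );
}

// Reading a count is then a single lookup by _id instead of an index scan.
async function readCount(counts, status) {
  const doc = await counts.findOne({ _id: COUNTER_ID });
  return doc && typeof doc[status] === 'number' ? doc[status] : 0;
}
```

Option 1 needs no extra writes but is only as accurate as the estimate. Option 2 gives constant-time reads at the cost of an extra update on every insert, delete or status change.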
