MongoDB's performance on aggregation queries

Question

After hearing so many good things about MongoDB's performance, we decided to give MongoDB a try to solve a problem we have. I started by moving all the records we had in several MySQL databases into a single collection in MongoDB. This resulted in a collection with 29 million documents (each of them with at least 20 fields) that takes up around 100 GB of disk space. We decided to put them all in one collection since all the documents have the same structure and we want to query and aggregate results across all of them.

I created some indexes to match my queries; otherwise even a simple count() would take ages. However, queries such as distinct() and group() still take far too long.

Example:

// creation of a compound index    
db.collection.ensureIndex({'metadata.system':1, 'metadata.company':1})

// query to get all the combinations of companies and systems
db.collection.group({key: { 'metadata.system':true, 'metadata.company':true }, reduce: function(obj,prev) {}, initial: {} });

I took a look at the mongod log and it has a lot of lines like these (while executing the query above):

Thu Apr  8 14:40:05 getmore database.collection cid:973023491046432059 ntoreturn:0 query: {}  bytes:1048890 nreturned:417 154ms
Thu Apr  8 14:40:08 getmore database.collection cid:973023491046432059 ntoreturn:0 query: {}  bytes:1050205 nreturned:414 430ms
Thu Apr  8 14:40:18 getmore database.collection cid:973023491046432059 ntoreturn:0 query: {}  bytes:1049748 nreturned:201 130ms
Thu Apr  8 14:40:27 getmore database.collection cid:973023491046432059 ntoreturn:0 query: {}  bytes:1051925 nreturned:221 118ms
Thu Apr  8 14:40:30 getmore database.collection cid:973023491046432059 ntoreturn:0 query: {}  bytes:1053096 nreturned:250 164ms
...
Thu Apr  8 15:04:18 query database.$cmd ntoreturn:1 command  reslen:4130 1475894ms

This query took 1475894 ms (roughly 25 minutes), which is far longer than I would expect (the result list has around 60 entries). First of all, is this expected given the large number of documents in my collection? Are aggregation queries in general expected to be this slow in MongoDB? Any thoughts on how I can improve the performance?

I am running mongod on a single machine with a dual-core CPU and 10 GB of memory.

Thanks.

Answer

The idea is that you improve the performance of aggregation queries by using MapReduce on a sharded database that is distributed over multiple machines.
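
For illustration, the group() query from the question could be expressed as a map-reduce job roughly like this (a minimal sketch in the mongo shell, not the answerer's actual code; it assumes the same metadata.system/metadata.company fields and a server version that supports inline output):

// map: emit each document's system/company combination as the key
var mapFn = function () {
    emit({ system: this.metadata.system, company: this.metadata.company }, 1);
};

// reduce: count how many documents share each combination
var reduceFn = function (key, values) {
    return Array.sum(values);
};

// run the job; the result set is small (~60 combinations), so return it inline
db.collection.mapReduce(mapFn, reduceFn, { out: { inline: 1 } });

On a single machine this still has to scan the whole collection; the point of the answer is that the map and reduce work can then be spread across the shards.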

I did some comparisons of the performance of Mongo's MapReduce with a group-by select statement in Oracle on the same machine. I found that Mongo was approximately 25 times slower. This means I would have to shard the data over at least 25 machines to get the same performance from Mongo that Oracle delivers on a single machine. I used a collection/table with approximately 14 million documents/rows.

Exporting the data from Mongo via mongoexport.exe, loading the exported file as an external table in Oracle, and doing the group-by in Oracle was much faster than using Mongo's own MapReduce.
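
As a sketch of the export step only (hypothetical database and file names; the legacy --csv flag with an explicit --fields list is assumed, and the Oracle external-table definition is omitted):

mongoexport --db yourDatabase --collection collection --csv --fields metadata.system,metadata.company --out collection_export.csv

The exported CSV can then be referenced by an Oracle external table and grouped with an ordinary GROUP BY.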
