MongoDB Aggregation seems very slow


Problem description

I have a mongodb instance running with the following stats:

{
    "db" : "s",
    "collections" : 4,
    "objects" : 1.23932e+008,
    "avgObjSize" : 239.9999891553412400,
    "dataSize" : 29743673136.0000000000000000,
    "storageSize" : 32916655936.0000000000000000,
    "numExtents" : 39,
    "indexes" : 3,
    "indexSize" : 7737839984.0000000000000000,
    "fileSize" : 45009076224.0000000000000000,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "extentFreeList" : {
        "num" : 0,
        "totalSize" : 0
    },
    "ok" : 1.0000000000000000
}
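
For reference, output in this shape is what the mongo shell's db.stats() helper returns (a sketch, assuming the numbers above were gathered that way; the database name "s" comes from the output itself):

use s
db.stats()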

I'm trying to run the following query:

db.getCollection('tick_data').aggregate([
    { $group: { _id: "$ccy", min: { $first: "$date_time" }, max: { $last: "$date_time" } } }
])

And I have the following index set up on the collection:

{
    "ccy" : 1,
    "date_time" : 1
}
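
A compound index of this shape is typically created with createIndex (a sketch; any index options actually used aren't shown in the question, and on older shells ensureIndex is the equivalent helper):

db.tick_data.createIndex({ "ccy" : 1, "date_time" : 1 })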

The query takes 510 seconds to run, which feels extremely slow, even allowing for the collection being fairly large (~120 million documents). Is there a simple way for me to make this faster?

Every document has the structure:

{
    "_id" : ObjectId("56095bd7b2fc3e36d8d6ed52"),
    "bid_volume" : "6.00",
    "date_time" : ISODate("2007-01-01T00:00:07.904Z"),
    "ccy" : "USDNOK",
    "bid" : 6.2271700000000001,
    "ask_volume" : "6.00",
    "ask" : 6.2357699999999996
}

Explain results:

{
    "stages" : [ 
        {
            "$cursor" : {
                "query" : {},
                "fields" : {
                    "ccy" : 1,
                    "date_time" : 1,
                    "_id" : 0
                },
                "plan" : {
                    "cursor" : "BasicCursor",
                    "isMultiKey" : false,
                    "scanAndOrder" : false,
                    "allPlans" : [ 
                        {
                            "cursor" : "BasicCursor",
                            "isMultiKey" : false,
                            "scanAndOrder" : false
                        }
                    ]
                }
            }
        }, 
        {
            "$group" : {
                "_id" : "$ccy",
                "min" : {
                    "$first" : "$date_time"
                },
                "max" : {
                    "$last" : "$date_time"
                }
            }
        }
    ],
    "ok" : 1.0000000000000000
}
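
For reference, an aggregation explain of this shape is typically produced by passing the explain option to aggregate (a sketch; the exact invocation isn't shown in the question):

db.getCollection('tick_data').aggregate(
    [ { $group: { _id: "$ccy", min: { $first: "$date_time" }, max: { $last: "$date_time" } } } ],
    { explain: true }
)

Note that the "BasicCursor" plan in the $cursor stage means the pipeline reads the entire collection rather than using the { ccy: 1, date_time: 1 } index.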

Thanks

Recommended answer

In the end I wrote a function which takes 0.002 seconds to run.

function() {
    var results = {};

    // Distinct currency pairs; this can be served from the { ccy: 1, date_time: 1 } index.
    var ccys = db.tick_data.distinct("ccy");

    ccys.forEach(function(ccy) {
        var min_results = [];
        var max_results = [];

        // For each currency, an index-backed sort plus limit(1) fetches the earliest
        // and latest date_time without scanning the whole collection.
        db.tick_data.find({ "ccy": ccy }, { "date_time": 1, "_id": 0 })
            .sort({ "date_time": 1 }).limit(1)
            .forEach(function(v) { min_results.push(v.date_time); });
        db.tick_data.find({ "ccy": ccy }, { "date_time": 1, "_id": 0 })
            .sort({ "date_time": -1 }).limit(1)
            .forEach(function(v) { max_results.push(v.date_time); });

        results[ccy] = {
            "max_date_time": max_results[0],
            "min_date_time": min_results[0]
        };
    });

    return results;
}
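
To use it, the function can be assigned to a shell variable and invoked directly (a minimal sketch; the name minMaxByCcy is only for illustration):

var minMaxByCcy = function() {
    // ... function body as above ...
};
printjson(minMaxByCcy());
// Result shape: { "USDNOK" : { "max_date_time" : ISODate(...), "min_date_time" : ISODate(...) }, ... }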
