Performance tuning MongoDB query/update?

Question

I have a MongoDB instance where I am trying to update data in one collection with data from another collection. The two collections are participants, with about 180k documents, and questions, with about 95k documents.

Documents in participants typically look something like this:

{
    "_id" : ObjectId("52f90b8bbab16dd8594b82b4"),
    "answers" : [
        {
            "_id" : ObjectId("52f90b8bbab16dd8594b82b9"),
            "question_id" : 2081,
            "sub_id" : null,
            "values" : [
                "Yes"
            ]
        },
        {
            "_id" : ObjectId("52f90b8bbab16dd8594b82b8"),
            "question_id" : 2082,
            "sub_id" : 123,
            "values" : [
                "Would prefer to go alone"
            ]
        },
        {
            "_id" : ObjectId("52f90b8bbab16dd8594b82b7"),
            "question_id" : 2082,
            "sub_id" : 456,
            "values" : [
                "Yes"
            ]
        }
    ],
    "created" : ISODate("2012-03-01T17:40:21Z"),
    "email" : "anonymous",
    "id" : 65,
    "survey" : ObjectId("52f41d579af1ff4221399a7b"),
    "survey_id" : 374
}

I am using the query below to perform the update:

db.participants.ensureIndex({"answers.question_id": 1, "answers.sub_id": 1});
print("created index for answer arrays!")
db.questions.find().forEach(function(doc){
    db.participants.update(
        {
            "answers.question_id": doc.id,
            "answers.sub_id": doc.sub_id
        },
        {
            $set:
            {
                "answers.$.question": doc._id
            }

        },
        false,
        true
    );
});
db.participants.dropIndex({"answers.question_id": 1, "answers.sub_id": 1});

But this takes about 20 minutes to run. I was hoping that adding the index would help with performance, but it is still pretty slow. Is this index set up correctly, considering that I am indexing fields in an array of objects? Can anyone see anything I am doing that would cause the slowness? Any suggestions on where to start looking to improve the performance of this query?

Recommended answer

In case anyone is interested, I was able to take the run time of this update query from 20 minutes down to about a minute and a half by using a projection when selecting the questions documents. Since I am only using the _id, id and sub_id fields, I was able to do the following:

db.questions.find({},{_id: 1, id: 1, sub_id: 1}).forEach(function(doc){
    ....

This drastically improved performance. Hope this helps someone!
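For anyone wanting to see the projection in the context of the full loop, here is a minimal sketch that combines it with the update from the question. The find projection and the update call are copied from the text above; the wrapper function linkQuestions, its db parameter, and the returned counter are hypothetical additions made only so the snippet stands on its own. It still issues one update per questions document, so the only change from the original loop is that fewer fields are fetched for each question.

```javascript
// Sketch only: linkQuestions and its return value are illustrative, not from the answer.
// `db` is passed in explicitly; in the mongo shell you would pass the global `db`.
function linkQuestions(db) {
    var processed = 0;

    // Fetch only the three fields the update actually uses.
    db.questions.find({}, {_id: 1, id: 1, sub_id: 1}).forEach(function (doc) {
        db.participants.update(
            // Same filter as in the question.
            {"answers.question_id": doc.id, "answers.sub_id": doc.sub_id},
            // Store the question's _id on the matched answers element.
            {$set: {"answers.$.question": doc._id}},
            false, // upsert: do not insert when nothing matches
            true   // multi: update every matching participant
        );
        processed += 1;
    });

    // Number of questions documents iterated, not number of participants modified.
    return processed;
}
```

In the mongo shell this would be run as linkQuestions(db), between the ensureIndex and dropIndex calls from the question.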
