Flattening mongoDB schema


Question

I have an existing, deeply nested mongoDB schema that I must flatten, because I have a complex query that cannot be made efficiently with the current structure. Here is an MWE of the schema:

db.test.insert({
    "_id" : ObjectId("58e574a768afb6085ec3a388"),
    "details" : [
            {
                "_id" : ObjectId("58e55f0f68afb6085ec3a2cc"),
                "a" : [
                    {
                        "unit" : "08",
                        "size" : "5",
                        "pos" : "Far",
                        "_id" : ObjectId("58e55f0f68afb6085ec3a2d0")
                    }
                ],
                "b" : [
                    {
                        "unit" : "08",
                        "size" : "5",
                        "pos" : "Far",
                        "_id" : ObjectId("58e55f0f68afb6085ec3a2cd")
                    }
                ],
                "c" : [
                    {
                        "unit" : "08",
                        "size" : "3",
                        "pos" : "Far",
                        "_id" : ObjectId("58e55f0f68afb6085ec3a2ce")
                    }
                ],
                "d" : [
                    {
                        "unit" : "08",
                        "size" : "5",
                        "pos" : "Far",
                        "_id" : ObjectId("58e55f0f68afb6085ec3a2cf")
                    }
                ]
            }
        ]
    })

I want to flatten out the schema. The desired result is this:

"_id" : ObjectId("58e574a768afb6085ec3a388"),
"tests" : [
        {
            "_id" : ObjectId("58e542fb68afb6085ec3a1d2"),
            "aUnit" : "08",
            "aSize" : "5",
            "aPos" : "Far",
            "bPos" : "Far",
            "bSize" : "5",
            "bUnit" : "08",
            "cPos" : "Far",
            "cSize" : "3",
            "cUnit" : "08",
            "dPos" : "Far",
            "dSize" : "5",
            "dUnit" : "08"
        }
    ]

I'm willing to handle each entry type one at a time, and I thought I had a method to do so, but it is not working. Here is what I tried:

db.test.find({"tests.$.details.a.unit":{$exists:true}}).forEach(function(doc) {      
    doc.tests = {aUnit:tests.details.a.unit};
    delete tests.details.a.unit;
    db.test.save(doc);
    });

However, this changes nothing. How can I improve my query in order to flatten my schema?

EDITED: I realized that the MWE had a minor error compared to the data I intended to use it on. I was closing each entry separately. For example, "a" : [{ ... }], was incorrectly written as {"a" : [{ ... }]},. It is now updated.

Recommended answer

New Response

Print the data

db.test.find().forEach(doc => {
  doc.details = doc.details.map( detail => {
    Object.keys(detail).filter( k => k !== "_id" ).forEach( k => {
      detail[k].forEach( item => {
        Object.keys(item).filter(i => i !== "_id" ).forEach( inner => {
          detail[k + inner.charAt(0).toUpperCase() + inner.substr(1)]
            = item[inner];
        })
      });
      delete detail[k];
    });
    return detail;
  });
  printjson(doc);
});

Update the data

// Collect the pending bulk operations; must be declared before the loop.
let ops = [];

db.test.find().forEach(doc => {
  doc.details = doc.details.map( detail => {
    Object.keys(detail).filter( k => k !== "_id" ).forEach( k => {
      detail[k].forEach( item => {
        Object.keys(item).filter(i => i !== "_id" ).forEach( inner => {
          detail[k + inner.charAt(0).toUpperCase() + inner.substr(1)]
            = item[inner];
        })
      });
      delete detail[k];
    });
    return detail;
  });

  ops = [
    ...ops,
    { "updateOne": {
      "filter": { "_id": doc._id },
      // Target the "details" field itself; "doc" is only the local variable name.
      "update": { "$set": { "details": doc.details } }
    }}
  ];

  if ( ops.length >= 500 ) {
    db.test.bulkWrite(ops);
    ops = [];
  }
});

if ( ops.length > 0 ) {
  db.test.bulkWrite(ops);
  ops = [];
}

Output form

{
    "_id" : ObjectId("58e574a768afb6085ec3a388"),
    "details" : [
        {
          "_id" : ObjectId("58e55f0f68afb6085ec3a2cc"),
          "aUnit" : "08",
          "aSize" : "5",
          "aPos" : "Far",
          "bUnit" : "08",
          "bSize" : "5",
          "bPos" : "Far",
          "cUnit" : "08",
          "cSize" : "3",
          "cPos" : "Far",
          "dUnit" : "08",
          "dSize" : "5",
          "dPos" : "Far"
        }
    ]
}


Original data

{
    "_id" : ObjectId("58e574a768afb6085ec3a388"),
    "tests" : [
      {
        "_id" : ObjectId("58e542fb68afb6085ec3a1d2"),
        "details" : [
          {
            "a" : [
              {
                "unit" : "08",
                "size" : "5",
                "pos" : "Far",
                "_id" : ObjectId("58e542fb68afb6085ec3a1d6")
              }
            ]
          },
          {
            "b" : [
              {
                "pos" : "Drive Side Far",
                "size" : "5",
                "unit" : "08",
                "_id" : ObjectId("58e542fb68afb6085ec3a1d3")
              }
            ]
          },
          {
            "c" : [
              {
                "pos" : "Far",
                "size" : "3",
                "unit" : "08",
                "_id" : ObjectId("58e542fb68afb6085ec3a1d4")
              }
            ]
          },
          {
            "d" : [
              {
                "pos" : "Far",
                "size" : "5",
                "unit" : "08",
                "_id" : ObjectId("58e542fb68afb6085ec3a1d5")
              }
            ]
          }
        ]
      }
    ]
}


Original Answer

If you are trying to "update" your data, then it's a lot more involved than what you are attempting. You have several arrays, and you need to actually "traverse" the array elements rather than trying to access them directly.
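To illustrate the difference between direct access and traversal, here is a minimal sketch in plain JavaScript. The `doc` object is a trimmed stand-in for the sample document (ObjectId values left out, only the "a" entry kept), so it is an assumption for illustration rather than the real data:

```javascript
// Trimmed stand-in for the sample document (ObjectId values omitted).
const doc = {
  details: [
    { a: [ { unit: "08", size: "5", pos: "Far" } ] }
  ]
};

// "details" is an array, so it has no property called "a":
// dotted access straight through the array yields undefined.
const direct = doc.details.a;

// Walking each array level in turn reaches the nested values.
const units = [];
doc.details.forEach(detail => {
  detail.a.forEach(item => units.push(item.unit));
});
```

Here `direct` is `undefined`, while `units` collects the "unit" value from every element of "a".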

Here's just a sample to "print out" the "flattened" data:

db.test.find().forEach(doc => {
  doc.tests = doc.tests.map( test => {
    test.details.forEach( detail => {
      Object.keys(detail).forEach( key => {
        detail[key].forEach( item => {
          Object.keys(item).forEach( inner => {
            if ( inner !== '_id' ) {
              test[key + inner.charAt(0).toUpperCase() + inner.substr(1)]
                = item[inner];
            }
          });
        });
      });
    });
    delete test.details;
    return test;
  });
  printjson(doc);
})

Which I believe gives the structure you are looking for:

{
    "_id" : ObjectId("58e574a768afb6085ec3a388"),
    "tests" : [
        {
            "_id" : ObjectId("58e542fb68afb6085ec3a1d2"),
            "aUnit" : "08",
            "aSize" : "5",
            "aPos" : "Far",
            "bPos" : "Drive Side Far",
            "bSize" : "5",
            "bUnit" : "08",
            "cPos" : "Far",
            "cSize" : "3",
            "cUnit" : "08",
            "dPos" : "Far",
            "dSize" : "5",
            "dUnit" : "08"
        }
    ]

}

Note that I'm not taking into account the possibility that documents with keys like "a" etc. could appear multiple times inside your "details" array. I am assuming there is only ever one document in there with an "a" or a "b" etc., so the last value found for a given key is the one that is kept when the new keys are added to the top level of the "details" documents.

If your actual case varies, then you would need to modify the various .forEach() loops in there to also take the "index" as a parameter and include that index value as part of the key name, i.e.:

"a0Unit": "08",
"a0Size": "05",
"a1Unit": "09",
"a1Size": "06"

But that is a detail you will have to work out if necessary, since it would differ from how the data is presented in the question.
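As a rough sketch of that index-based variant, assuming plain JavaScript objects shaped like one "details" entry: the helper names `capitalise` and `flattenWithIndex` are hypothetical, and the sample values are taken from the key listing above.

```javascript
// Hypothetical helper: upper-case the first letter of a field name.
const capitalise = s => s.charAt(0).toUpperCase() + s.slice(1);

// Flatten one "details" entry, putting the array index into each new key
// so that repeated entries no longer overwrite each other.
function flattenWithIndex(detail) {
  const out = {};
  Object.keys(detail).forEach(key => {
    if (key === "_id") {
      out._id = detail._id;   // keep the entry's own _id as-is
      return;
    }
    detail[key].forEach((item, index) => {
      Object.keys(item).filter(inner => inner !== "_id").forEach(inner => {
        out[key + index + capitalise(inner)] = item[inner];
      });
    });
  });
  return out;
}

// Two items under "a", as in the key listing above.
const sample = {
  a: [ { unit: "08", size: "05" }, { unit: "09", size: "06" } ]
};
const flat = flattenWithIndex(sample);
```

With this input, `flat` holds the keys a0Unit, a0Size, a1Unit and a1Size. The same function could be used inside the `.map()` call of the shell loops in place of the inner `.forEach()` chain.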

If, however, this is a perfect fit for what you want to update to, then simply run the loop with .bulkWrite() statements executing at regular intervals:

let ops = [];

db.test.find().forEach(doc => {
  doc.tests = doc.tests.map( test => {
    test.details.forEach( detail => {
      Object.keys(detail).forEach( key => {
        detail[key].forEach( item => {
          Object.keys(item).forEach( inner => {
            if ( inner !== '_id' ) {
              test[key + inner.charAt(0).toUpperCase() + inner.substr(1)]
                = item[inner];
            }
          });
        });
      });
    });
    delete test.details;
    return test;
  });

  ops = [
    ...ops,
    { "updateOne": {
      "filter": { "_id": doc._id },
      "update": { "$set": { "tests": doc.tests } }
    }}
  ];

  if ( ops.length >= 500 ) {
    db.test.bulkWrite(ops);
    ops = [];
  }
});

if ( ops.length > 0 ) {
  db.test.bulkWrite(ops);
  ops = [];
}


It also appears from the _id fields present in each array member document that you are using mongoose. So whatever you do, do not try to run the code using mongoose itself. It's a "one off" bulk update of your data and should be run directly from the shell. Then of course you will need to modify your schema to suit the new structure.

But this is why you should run through your data in the shell with the printjson() method first.
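As part of such a dry run, it may also help to check whether any flattened key would be assigned more than once, since the code above silently keeps the last value. The following is a minimal sketch in plain JavaScript; `upperFirst` and `findOverwrittenKeys` are hypothetical names, and the input is assumed to be one "details" array as plain objects:

```javascript
// Hypothetical helper: upper-case the first letter of a field name.
const upperFirst = s => s.charAt(0).toUpperCase() + s.slice(1);

// Return the flattened key names that would be written more than once
// for a single "details" array (meaning a later value replaces an earlier one).
function findOverwrittenKeys(details) {
  const seen = new Set();
  const overwritten = new Set();
  details.forEach(detail => {
    Object.keys(detail).filter(key => key !== "_id").forEach(key => {
      detail[key].forEach(item => {
        Object.keys(item).filter(inner => inner !== "_id").forEach(inner => {
          const name = key + upperFirst(inner);
          if (seen.has(name)) overwritten.add(name);
          seen.add(name);
        });
      });
    });
  });
  return [...overwritten];
}

// One item per key: nothing is overwritten.
const clean = [ { a: [ { unit: "08" } ] }, { b: [ { unit: "08" } ] } ];
// Two items under "a": "aUnit" would be written twice.
const clashing = [ { a: [ { unit: "08" }, { unit: "09" } ] } ];

const cleanResult = findOverwrittenKeys(clean);
const clashingResult = findOverwrittenKeys(clashing);
```

In the shell, a function like this could be called on each document's array inside a `find().forEach()` loop, printing the _id of any document for which the result is not empty.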
