MongoDB: Should You Pre-Allocate a Document if Using $addToSet or $push?

Question

I've been studying up on MongoDB and I understand that it is highly recommended that document structures be completely built out (pre-allocated) at the point of insert, so that future changes to the document do not require it to be moved around on disk. Does this apply when using $addToSet or $push?

For example, say I have the following document:

{
    "_id" : "rsMH4GxtduZZfxQrC",
    "createdAt" : ISODate("2015-03-01T12:08:23.007Z"),
    "market" : "LTC_CNY",
    "type" : "recentTrades",
    "data" : [
        {
            "date" : "1422168530",
            "price" : 13.8,
            "amount" : 0.203,
            "tid" : "2435402",
            "type" : "buy"
        },
        {
            "date" : "1422168529",
            "price" : 13.8,
            "amount" : 0.594,
            "tid" : "2435401",
            "type" : "buy"
        },
        {
            "date" : "1422168529",
            "price" : 13.79,
            "amount" : 0.594,
            "tid" : "2435400",
            "type" : "buy"
        }
    ]
}

And I am using one of the following commands to add a new array of objects (newData) to the data field:

$addToSet to add to the end of the array:

Collection.update(
  { _id: 'rsMH4GxtduZZfxQrC' },
  {
    $addToSet: {
      data: {
        $each: newData
      }
    }
  }
);

$push (with $position) to add to the front of the array:

Collection.update(
  { _id: 'rsMH4GxtduZZfxQrC' },
  {
    $push: {
      data: {
        $each: newData,
        $position: 0
      }
    }
  }
);

The data array in the document will grow due to new objects that were added from newData. So will this type of document update cause the document to be moved around on the disk?

For this particular system, the data array in these documents can grow to upwards of 75k objects, so if these documents are indeed being moved around on disk after every $addToSet or $push update, should the document be defined with 75k nulls (data: [null,null...null]) on insert, and then perhaps use $set to replace the values over time? Thanks!

Recommended answer

I understand that it is highly recommended that document structures be completely built out (pre-allocated) at the point of insert, so that future changes to the document do not require it to be moved around on disk. Does this apply when using $addToSet or $push?

It's recommended if it's feasible for the use case, which it usually isn't. Time series data is a notable exception. It doesn't really apply with $addToSet and $push because they tend to increase the size of the document by growing an array.

the data array in these documents can grow to upwards of 75k objects

Stop. Are you sure you want constantly growing arrays with tens of thousands of entries? Are you going to query wanting specific entries back? Are you going to index any fields in the array entries? You probably want to rethink your document structure. Maybe you want each data entry to be a separate document with fields like market, type, createdAt replicated in each? You wouldn't be worrying about document moves.
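A minimal sketch of that restructuring, written in plain JavaScript so it runs without a database. The helper name toTradeDocuments, the renaming of each trade's type to side (the parent and the trades both use a field called type, so one of them has to change), and the reading of date as Unix seconds are all assumptions for illustration, not part of the question.

```javascript
// Split one large "recentTrades" document into one small document per trade.
// Hypothetical helper: name and output field layout are illustrative only.
function toTradeDocuments(parent) {
  return parent.data.map((trade) => ({
    // Fields replicated from the parent document.
    market: parent.market,
    type: parent.type,
    createdAt: parent.createdAt,
    // Fields taken from the individual trade.
    tid: trade.tid,
    // Assumption: "date" is a Unix timestamp in seconds stored as a string.
    date: new Date(Number(trade.date) * 1000),
    price: trade.price,
    amount: trade.amount,
    side: trade.type, // renamed to avoid clashing with the parent's "type"
  }));
}

const parentDocument = {
  _id: "rsMH4GxtduZZfxQrC",
  createdAt: new Date("2015-03-01T12:08:23.007Z"),
  market: "LTC_CNY",
  type: "recentTrades",
  data: [
    { date: "1422168530", price: 13.8, amount: 0.203, tid: "2435402", type: "buy" },
    { date: "1422168529", price: 13.8, amount: 0.594, tid: "2435401", type: "buy" },
  ],
};

const tradeDocuments = toTradeDocuments(parentDocument);
console.log(tradeDocuments);
```

Each resulting document has a fixed shape and never grows, so appending trades becomes a plain insert. A unique index on { market: 1, tid: 1 } would be one way to keep the de-duplication that $addToSet provided.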

Why will the array grow to 75K entries? Can you store fewer entries per document? Is this time series data? It's great to be able to preallocate documents and do in-place updates with the MMAPv1 storage engine, but it's not feasible for every use case and it's not a requirement for MongoDB to perform well.
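For the time series case, pre-allocation could look roughly like the sketch below: one document per market per hour, with 60 fixed-size numeric slots that are later overwritten with $set. The schema, the helper names preallocateHour and setMinuteUpdate, and the one-value-per-minute granularity are assumptions for illustration. The snippet only builds the insert document and the update specification as plain objects.

```javascript
// Build a pre-allocated document: 60 numeric placeholders, one per minute.
// Assumption: numbers are stored as 8-byte BSON doubles (as the mongo shell
// does), so replacing 0 with a real price does not change the document size.
// A driver that stores integers as int32 would need an explicit double
// placeholder instead.
function preallocateHour(market, hourStart) {
  return {
    market: market,
    hour: hourStart,
    prices: new Array(60).fill(0),
  };
}

// Build the update specification that overwrites a single minute slot.
function setMinuteUpdate(minute, price) {
  if (!Number.isInteger(minute) || minute < 0 || minute > 59) {
    throw new RangeError("minute must be an integer from 0 to 59");
  }
  return { $set: { ["prices." + minute]: price } };
}

const hourDocument = preallocateHour("LTC_CNY", new Date("2015-03-01T12:00:00Z"));
const minuteUpdate = setMinuteUpdate(17, 13.8);
console.log(JSON.stringify(minuteUpdate));

// Possible usage, in the style of the question (not executed here):
// Collection.insert(hourDocument);
// Collection.update({ market: "LTC_CNY", hour: hourDocument.hour }, minuteUpdate);
```

This works for fixed-width values on a known schedule. Trade objects arriving at an unknown rate do not fit that mould, which is why the answer treats time series as the exception.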

should the document be defined with 75k nulls (data: [null,null...null]) on insert, and then perhaps use $set to replace the values over time?

No, this is not really helpful. The document size will be computed based on the BSON size of the null values in the array, so when you replace null with another type the size will increase and you'll get document rewrites anyway. You would need to preallocate the array with objects whose fields are all set to a default value of the right type, e.g.

{
    "date" : ISODate("1970-01-01T00:00:00Z"),    // use a date type instead of a string date
    "price" : 0,
    "amount" : 0,
    "tid" : "0000000", // assuming 7 character code - strings icky for default preallocation
    "type" : "none"    // assuming it's "buy" or "sell", want a default as long as longest real values
}
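To make the size argument concrete, here is a rough BSON size estimate for the value types that appear in this thread. It is a hand-rolled sketch that follows the BSON layout (a 4-byte length prefix, then for each element a type byte, a null-terminated key and the value, then a trailing zero byte), not a call into any MongoDB library. The function names are hypothetical, it assumes every number is stored as an 8-byte double, and it uses Node's Buffer.byteLength for UTF-8 lengths.

```javascript
// Size of a value's payload, excluding its type byte and key.
// Assumption: all numbers are BSON doubles (8 bytes).
function valueSize(value) {
  if (value === null) return 0;                 // null carries no payload
  if (typeof value === "number") return 8;
  if (typeof value === "boolean") return 1;
  if (typeof value === "string") {
    return 4 + Buffer.byteLength(value, "utf8") + 1; // length prefix + bytes + 0x00
  }
  if (value instanceof Date) return 8;          // UTC datetime, int64 milliseconds
  if (typeof value === "object") return documentSize(value); // sub-document or array
  throw new TypeError("unsupported type: " + typeof value);
}

// Size of a document; arrays are documents keyed by "0", "1", ...
function documentSize(doc) {
  let size = 4 + 1; // int32 length prefix + trailing 0x00
  for (const [key, value] of Object.entries(doc)) {
    size += 1 + Buffer.byteLength(key, "utf8") + 1 + valueSize(value);
  }
  return size;
}

const placeholderTrade = {
  date: new Date(0), price: 0, amount: 0, tid: "0000000", type: "none",
};
const realTrade = {
  date: new Date(1422168530 * 1000), price: 13.8, amount: 0.203, tid: "2435402", type: "buy",
};

const sizeWithNulls = documentSize({ data: [null, null, null] });
const sizeWithPlaceholders = documentSize({
  data: [placeholderTrade, placeholderTrade, placeholderTrade],
});

console.log("three null slots:", sizeWithNulls, "bytes");
console.log("three placeholder slots:", sizeWithPlaceholders, "bytes");
console.log("placeholder:", documentSize(placeholderTrade), "real trade:", documentSize(realTrade));
```

Under these assumptions a null slot is far smaller than a trade object, so swapping a null for a trade always grows the document. A placeholder with default values of the right types and lengths is at least as large as a real trade.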
