Does performing a partial update on a MongoDB document in WiredTiger provide any advantage over a full document update?

Question

I'm using the Java driver (although this question is not language specific) to write partial updates to MongoDB documents, because with the MMAPv1 storage engine documents are edited in place (in memory), which gives better performance. This adds considerable development complexity, as I could instead save the entire document at once and not worry about exactly what got updated. After updating to WiredTiger I learned that this newer storage engine does not edit documents in place (in memory) but instead allocates new memory for each write (unclear if this means full copy of the document or just diff). Does this mean that it makes no performance difference whether I do a full document write vs a partial one?

Recommended answer

After updating to WiredTiger I learned that this newer storage engine does not edit documents in place (in memory) but instead allocates new memory for each write (unclear if this means full copy of the document or just diff).

WiredTiger uses Multiversion Concurrency Control (MVCC) to maintain multiple views of data for the lifetime of readers. WiredTiger’s in-memory format is different from the on-disk format: in-memory it stores diffs to a document, but a full version of the document is constructed when flushed to the data files as part of periodic checkpoints.
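
The idea of keeping diffs in memory and building a full document only at checkpoint time can be pictured with a small toy model. This is only a sketch of the concept as described above, not WiredTiger's actual data structures: the class `DeltaChainSketch` and its methods are hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only (hypothetical): a full base document plus a chain of field-level diffs. */
class DeltaChainSketch {
    // Last full version of the document, standing in for what a checkpoint wrote out.
    private Map<String, Object> base;
    // In-memory diffs, oldest first; each holds only the fields one update changed.
    private final List<Map<String, Object>> deltas = new ArrayList<>();

    DeltaChainSketch(Map<String, Object> initial) {
        this.base = new LinkedHashMap<>(initial);
    }

    /** Records an update as a diff and returns the version number it created. */
    int update(Map<String, Object> changedFields) {
        deltas.add(new LinkedHashMap<>(changedFields));
        return deltas.size();
    }

    /** Rebuilds the document as of a version (0 = base), as a reader on that snapshot would see it. */
    Map<String, Object> readAt(int version) {
        Map<String, Object> view = new LinkedHashMap<>(base);
        for (int i = 0; i < version && i < deltas.size(); i++) {
            view.putAll(deltas.get(i));
        }
        return view;
    }

    /** Folds every pending diff into one full document, loosely analogous to a checkpoint. */
    void checkpoint() {
        base = readAt(deltas.size());
        deltas.clear();
    }

    int pendingDeltas() {
        return deltas.size();
    }

    public static void main(String[] args) {
        DeltaChainSketch doc = new DeltaChainSketch(Map.of("status", "new", "qty", 1));
        int v1 = doc.update(Map.of("qty", 2));
        int v2 = doc.update(Map.of("status", "shipped"));

        // An older reader still sees the values from its own snapshot.
        System.out.println("qty at base: " + doc.readAt(0).get("qty"));
        System.out.println("qty at v" + v1 + ": " + doc.readAt(v1).get("qty"));
        System.out.println("status at v" + v2 + ": " + doc.readAt(v2).get("status"));

        doc.checkpoint();
        System.out.println("pending diffs after checkpoint: " + doc.pendingDeltas());
    }
}
```

In this model a small update adds a small diff, while the cost of producing the full document is deferred to `checkpoint()`.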

Does this mean that it makes no performance difference whether I do a full document write vs a partial one?

Irrespective of how different MongoDB storage engines handle persisting updates to disk, there are still performance benefits in using partial updates rather than full updates where possible (particularly if you are setting field values which are small relative to overall document size).

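For example, consider:

  • network traffic for document updates (any storage engine)
  • size of entries in the journal (any storage engine)
  • size of entries in the replication oplog (any storage engine)
  • size of the in-memory versions of updates (WiredTiger)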
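
To get a feel for the size difference behind the points above, here is a rough sketch in plain Java. It assumes a JSON-like text rendering as a stand-in for the real BSON wire format, so the byte counts are only indicative; the class `PayloadSizeSketch`, its methods, and the sample field names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: compares the rough payload size of a full document and a one-field update. */
class PayloadSizeSketch {
    /** Renders a flat string map as JSON-like text (no escaping); a rough proxy for BSON, not the real encoding. */
    static String render(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(e.getKey()).append("\":\"").append(e.getValue()).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    /** UTF-8 byte length of the rendered payload. */
    static int sizeInBytes(Map<String, String> fields) {
        return render(fields).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // A document with one small field and one large field (sample data, assumed).
        Map<String, String> fullDocument = new LinkedHashMap<>();
        fullDocument.put("_id", "order-1");
        fullDocument.put("status", "shipped");
        fullDocument.put("notes", "x".repeat(10_000));

        // The partial update carries only the field that actually changed.
        Map<String, String> partialUpdate = new LinkedHashMap<>();
        partialUpdate.put("status", "shipped");

        System.out.println("full replacement payload: " + sizeInBytes(fullDocument) + " bytes");
        System.out.println("partial update payload: " + sizeInBytes(partialUpdate) + " bytes");
    }
}
```

The gap grows with the size of the untouched fields, which is why the benefit is largest when the changed values are small relative to the whole document.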

If you are sending full document updates each time, you also create scenarios where the order that updates reach the server is significant even when changes might be for distinct field sets. You could add additional application logic such as optimistic versioning to ensure you don't accidentally overwrite field values, but this may add unnecessary complexity depending on your use case.
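
The ordering problem and the optimistic versioning workaround can be sketched without a database. The `Store` class below is a hypothetical in-memory stand-in for the server, and all names are assumptions made for illustration. With the MongoDB Java driver, full replacement would roughly correspond to `replaceOne` and the field-level update to `updateOne` with `Updates.set`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of lost updates with full replacement versus field-level updates. */
class UpdateOrderingSketch {
    /** In-memory stand-in for a server holding one document and a version counter. */
    static final class Store {
        private final Map<String, Object> doc = new HashMap<>();
        private long version = 0;

        synchronized Map<String, Object> read() {
            return new HashMap<>(doc);
        }

        synchronized long version() {
            return version;
        }

        /** Full replacement: the last write to arrive wins for every field. */
        synchronized void replace(Map<String, Object> newDoc) {
            doc.clear();
            doc.putAll(newDoc);
            version++;
        }

        /** Partial update: only the named fields change. */
        synchronized void setFields(Map<String, Object> changes) {
            doc.putAll(changes);
            version++;
        }

        /** Optimistic versioning: replace only if nothing was written since expectedVersion. */
        synchronized boolean replaceIfVersion(long expectedVersion, Map<String, Object> newDoc) {
            if (version != expectedVersion) {
                return false;
            }
            replace(newDoc);
            return true;
        }
    }

    public static void main(String[] args) {
        // 1. Two clients read the same document, change different fields, and replace it whole.
        Store store = new Store();
        store.replace(Map.of("qty", 1, "status", "new"));
        Map<String, Object> clientA = store.read();
        Map<String, Object> clientB = store.read();
        clientA.put("qty", 2);
        clientB.put("status", "shipped");
        store.replace(clientA);
        store.replace(clientB); // overwrites qty with the stale value client B read earlier
        System.out.println("full replace, qty: " + store.read().get("qty"));

        // 2. The same changes as partial updates do not interfere with each other.
        Store partial = new Store();
        partial.replace(Map.of("qty", 1, "status", "new"));
        partial.setFields(Map.of("qty", 2));
        partial.setFields(Map.of("status", "shipped"));
        System.out.println("partial updates, qty: " + partial.read().get("qty"));

        // 3. Optimistic versioning rejects the stale full replacement instead of losing data.
        Store guarded = new Store();
        guarded.replace(Map.of("qty", 1, "status", "new"));
        long seen = guarded.version();
        Map<String, Object> first = guarded.read();
        Map<String, Object> second = guarded.read();
        first.put("qty", 2);
        second.put("status", "shipped");
        System.out.println("first write accepted: " + guarded.replaceIfVersion(seen, first));
        System.out.println("stale write accepted: " + guarded.replaceIfVersion(seen, second));
    }
}
```

In the third scenario the rejected client has to re-read the document and retry, which is the extra application logic the answer refers to.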
