Fragmentation in MongoDB when growing documents


Question

Seems like a blog with comments is the standard example used for describing different modeling strategies when using MongoDB.

My question relates to the model where comments are stored as a sub-collection on a single blog post document (i.e. one document stores everything related to a single blog post).

In the case of multiple simultaneous writes, it seems you would avoid overwriting previous updates if you use upserts and targeted update modifiers (like $push). That is, saving the document for every comment added would not overwrite previously added comments. However, how does fragmentation come into play here? Is it realistic to assume that adding multiple comments over time will result in fragmented memory and potentially slower queries? Are there any guidelines for growing a document through sub-collections?

I am also aware of the 16MB limit per document, but that to me seems like a theoretical limit since 16 MB would be an enormous amount of text. In the event of fragmentation, would the documents be compacted the next time the mongo instance is restarted and reads the database back into memory?

I know the way you expect to interact with the data is the best guiding principle for how to model the data (needing comments without the blog post parent etc). However I am interested in learning about potential issues with the highly denormalized single document approach. Are the issues I'm describing even realistic in the given blog post example?

Recommended answer

Before answering your questions, I'll try to explain MongoDB's storage mechanics in rough terms.


  • For a database named test, you will see files such as test.0, test.1, ..., so DATABASE = [FILE, ...]
  • FILE = [EXTENT, ...]
  • EXTENT = [RECORD, ...]
  • RECORD = HEADER + DOCUMENT + PADDING
  • HEADER = SIZE + OFFSET + PREV_RECORD_POINTER + NEXT_RECORD_POINTER + FLAG + ...
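
To make the nesting above concrete, here is a minimal sketch of the same layout as plain JavaScript objects. It is only an illustration: the 16-byte header, the helper names (makeRecord, makeExtent, fitsInPlace) and the sizes are assumptions chosen for the example, not MongoDB's actual on-disk format.

```javascript
// Illustrative model only: names follow the answer's notation and do not
// reflect MongoDB's real on-disk format.
const HEADER_BYTES = 16; // hypothetical fixed header size

// RECORD = HEADER + DOCUMENT + PADDING
function makeRecord(offset, documentBytes, paddingBytes) {
  return {
    header: {
      size: HEADER_BYTES + documentBytes + paddingBytes, // SIZE
      offset,                                            // OFFSET
      prev: null,                                        // PREV_RECORD_POINTER
      next: null,                                        // NEXT_RECORD_POINTER
      deleted: false,                                    // FLAG
    },
    documentBytes,
    paddingBytes,
  };
}

// EXTENT = [RECORD, ...]: lay records out back to back and link neighbours.
function makeExtent(documentSizes, paddingBytes) {
  const records = [];
  let offset = 0;
  for (const bytes of documentSizes) {
    const record = makeRecord(offset, bytes, paddingBytes);
    const previous = records[records.length - 1];
    if (previous) {
      previous.header.next = record.header.offset;
      record.header.prev = previous.header.offset;
    }
    records.push(record);
    offset += record.header.size;
  }
  return { records };
}

// A document can grow in place only while it still fits inside its record.
function fitsInPlace(record, newDocumentBytes) {
  return HEADER_BYTES + newDocumentBytes <= record.header.size;
}

const file = { extents: [makeExtent([100, 200], 32)] }; // FILE = [EXTENT, ...]
const database = { name: "test", files: [file] };       // DATABASE = [FILE, ...]
const first = database.files[0].extents[0].records[0];
console.log(first.header.size, fitsInPlace(first, 120), fitsInPlace(first, 140));
```

In this sketch the padding is what lets a document grow a little without leaving its record; once the document outgrows header + document + padding, it no longer fits in place.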

This link is for your reference.

Now I'll try to answer some of your questions as best I can.


  1. How does fragmentation come into play?
    It happens when the current record is not large enough to hold the updated document. MongoDB then migrates the document: it stores the updated document in a new, sufficiently large space and deletes the original record. The deleted record becomes a fragment.

  2. Will it result in fragmented memory and potentially slower queries?
    Fragmentation will occur, but it won't slow queries down unless there is eventually not enough memory left to allocate.

However, the deleted record can be reused if a newly inserted document fits into it. Below is a simple demonstration.
(Pay attention to the offset field.)

> db.a.insert([{_id:1},{_id:2},{_id:3}]);
BulkWriteResult({
        "writeErrors" : [ ],
        "writeConcernErrors" : [ ],
        "nInserted" : 3,
        "nUpserted" : 0,
        "nMatched" : 0,
        "nModified" : 0,
        "nRemoved" : 0,
        "upserted" : [ ]
})
> db.a.find()
{ "_id" : 1 }
{ "_id" : 2 }
{ "_id" : 3 }
> db.a.find().showDiskLoc()
{ "_id" : 1, "$diskLoc" : { "file" : 0, "offset" : 106672 } }
{ "_id" : 2, "$diskLoc" : { "file" : 0, "offset" : 106736 } }   // the update below moves this document and frees this record
{ "_id" : 3, "$diskLoc" : { "file" : 0, "offset" : 106800 } }
> db.a.update({_id:2},{$set:{arr:[1,2,3]}});
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.a.find().showDiskLoc()
{ "_id" : 1, "$diskLoc" : { "file" : 0, "offset" : 106672 } }
{ "_id" : 3, "$diskLoc" : { "file" : 0, "offset" : 106800 } }
{ "_id" : 2, "arr" : [ 1, 2, 3 ], "$diskLoc" : { "file" : 0, "offset" : 106864 } }  // migration happened
> db.a.insert({_id:4});
WriteResult({ "nInserted" : 1 })
> db.a.find().showDiskLoc()
{ "_id" : 1, "$diskLoc" : { "file" : 0, "offset" : 106672 } }
{ "_id" : 3, "$diskLoc" : { "file" : 0, "offset" : 106800 } }
{ "_id" : 2, "arr" : [ 1, 2, 3 ], "$diskLoc" : { "file" : 0, "offset" : 106864 } }
{ "_id" : 4, "$diskLoc" : { "file" : 0, "offset" : 106736 } }   // this space was taken up by {_id:2}, reused now.
>
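
The behaviour in the session above can be mimicked with a toy allocator. This is only a sketch under simplifying assumptions: the ToyExtent class, its first-fit reuse of deleted records and the byte sizes are all made up for illustration and are not MongoDB's real allocation strategy.

```javascript
// Toy simulation of record migration and reuse; not MongoDB's allocator.
class ToyExtent {
  constructor() {
    this.nextOffset = 0;   // end of the area used so far
    this.live = new Map(); // id -> { offset, capacity, used }
    this.free = [];        // deleted records: { offset, capacity }
  }

  // Find space for `size` bytes, preferring a deleted record that fits.
  allocate(size) {
    const i = this.free.findIndex((r) => r.capacity >= size);
    if (i !== -1) {
      return this.free.splice(i, 1)[0]; // reuse the fragment
    }
    const record = { offset: this.nextOffset, capacity: size };
    this.nextOffset += size;
    return record;
  }

  // Store a new document and return the offset it was placed at.
  insert(id, size, padding = 0) {
    const record = this.allocate(size + padding);
    this.live.set(id, { ...record, used: size });
    return record.offset;
  }

  // Resize a document. Returns true if it had to move to a new record.
  update(id, newSize) {
    const record = this.live.get(id);
    if (newSize <= record.capacity) {
      record.used = newSize; // still fits: update in place
      return false;
    }
    // Too big: the old record is deleted and becomes a fragment...
    this.free.push({ offset: record.offset, capacity: record.capacity });
    // ...and the document is stored in a new, large enough space.
    const target = this.allocate(newSize);
    this.live.set(id, { ...target, used: newSize });
    return true;
  }
}

const extent = new ToyExtent();
extent.insert(1, 64);
extent.insert(2, 64);
extent.insert(3, 64);
const moved = extent.update(2, 128);       // document 2 outgrows its record
const reusedOffset = extent.insert(4, 64); // lands in the freed record
console.log(moved, extent.live.get(2).offset, reusedOffset); // true 192 64
```

As in the shell session, the grown document moves to the end, and the next small insert takes over the offset that document 2 used to occupy.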
