MongoDB data schema performance

Problem description

I am trying to understand the internal allocation and placement of arrays and hashes (which, from my understanding, are implemented through arrays) in MongoDB documents.

In our domain we have documents with anywhere between thousands and hundreds of thousands of key-value pairs in logical groupings up to 5-6 levels deep (think nested hashes).

We represent the nesting in the keys with a dot, e.g., x.y.z, which upon insertion into MongoDB will automatically become something like:

{
    "_id" : "whatever",
    "x" : {
        "y" : {
            "z" : 5
        }
    }
}

The most common operation is incrementing a value, which we do with an atomic $inc, usually 1000+ values at a time with a single update command. New keys are added over time but not frequently, say, 100 times/day.

It occurred to me that an alternative representation would be to not use dots in names but some other delimiter and create a flat document, e.g.,

{
    "_id" : "whatever",
    "x-y-z" : 5
}

Given the number of key-value pairs and the usage pattern in terms of $inc updates and new key insertion, I am looking for guidance on the trade-offs between the two approaches in terms of:

  • space overhead on disk
  • performance of $inc updates
  • performance of new key inserts

Recommended answer

The on-disk storage of documents in MongoDB is in BSON format. There is a detailed description of the BSON format here: - http://bsonspec.org/#/specification

While there is some disk savings from using short key names (since, as you can see by looking at the spec, the key name is embedded in the document), it looks to me like there'd be almost no net difference between the two designs in terms of on-disk space used -- the extra bytes you use by using the delimiters (-) get bought back by not having to have string terminators for the separate key values.
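One way to check the space question for your own key names is to count bytes by hand from the BSON spec. The following is a rough sketch, not MongoDB's own code: the function names (`document_size`, `nest`, `flatten`) are made up for illustration, only embedded documents, strings, and small integers are handled, and every integer is assumed to be stored as a 4-byte int32 (values written from the mongo shell are typically stored as 8-byte doubles instead).

```python
def cstring_size(s):
    # BSON key names are UTF-8 bytes followed by a single 0x00 terminator.
    return len(s.encode("utf-8")) + 1


def value_size(value):
    # Only the types needed for this comparison are handled.
    if isinstance(value, dict):
        return document_size(value)
    if isinstance(value, bool):
        raise TypeError("booleans are not handled in this sketch")
    if isinstance(value, int):
        return 4  # assumption: the value fits in an int32
    if isinstance(value, str):
        # int32 length prefix + UTF-8 bytes + 0x00 terminator
        return 4 + len(value.encode("utf-8")) + 1
    raise TypeError(f"unsupported type: {type(value).__name__}")


def document_size(doc):
    # document = int32 total length + elements + trailing 0x00
    total = 4 + 1
    for key, value in doc.items():
        # element = 1 type byte + key name as cstring + value
        total += 1 + cstring_size(key) + value_size(value)
    return total


def nest(counters):
    # Build the nested form from {("x", "y", "z"): 5, ...}.
    root = {}
    for path, value in counters.items():
        node = root
        for key in path[:-1]:
            node = node.setdefault(key, {})
        node[path[-1]] = value
    return root


def flatten(counters, sep="-"):
    # Build the flat form, joining each path with a delimiter.
    return {sep.join(path): value for path, value in counters.items()}


# Hypothetical data: 100 counters that share the prefix group/sub.
counters = {("group", "sub", f"k{i}"): 0 for i in range(100)}
nested_bytes = document_size(nest(counters))
flat_bytes = document_size(flatten(counters))
print(nested_bytes, flat_bytes)
```

Under these assumptions, the two example documents from the question come out at 46 bytes (nested) and 34 bytes (flat). The balance shifts as more leaves share a prefix: each nested level costs a few bytes once, while the flat form repeats the full prefix in every key, so it is worth running the numbers on key names that resemble your real ones.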

$inc updates should take almost identical times with both formats, since they're both going to be in-memory operations. Any improvements in in-memory update time are going to be the tiniest of rounding errors compared to the time taken to read the document off of disk.
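For reference, the update document for a batch $inc looks almost the same in both designs; only the key delimiter differs. A minimal sketch follows, where `build_inc_update` is a hypothetical helper name and the counter paths are invented for illustration:

```python
def build_inc_update(increments, sep="."):
    # increments maps a key path (a tuple of names) to the amount to add.
    # With sep=".", the keys use MongoDB dot notation and address nested fields.
    # With another delimiter such as "-", the keys address flat top-level fields.
    return {"$inc": {sep.join(path): amount for path, amount in increments.items()}}


increments = {("x", "y", "z"): 1, ("x", "y", "w"): 3}
nested_update = build_inc_update(increments)         # keys "x.y.z", "x.y.w"
flat_update = build_inc_update(increments, sep="-")  # keys "x-y-z", "x-y-w"

# Assuming a pymongo collection object named `collection`, either update
# could be sent in one command, for example:
#   collection.update_one({"_id": "whatever"}, nested_update, upsert=True)
```

Either way the whole batch goes to the server as a single update command, which is consistent with the point above that the two layouts should behave very similarly for $inc.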

The performance of new key inserts should also be virtually identical. If adding the new key/value pair leaves the new document small enough to fit in the old location on disk, then all that happens is the in-memory version is updated and a journal entry gets written. Eventually, the in-memory version will be written to disk.

New key inserts are more problematic if the document grows beyond the space previously allocated for it. In that case, the server must move the document to a new location and update all indexes pointing to that document. This is generally a slower operation and should be avoided. However, the schema changes that you're discussing shouldn't affect the frequency of document movement. Again, I think this is a wash.

My suggestion would be to use the schema that most lends itself to developer productivity. If you're having performance problems, then you can ask separate questions about how you can either scale your system or improve performance, or both.
