Why are key names stored in the document in MongoDB

Question

I'm curious about this quote from Kyle Banker's MongoDB In Action:

It’s important to consider the length of the key names you choose, since key names are stored in the documents themselves. This contrasts with an RDBMS, where column names are always kept separate from the rows they refer to. So when using BSON, if you can live with dob in place of date_of_birth as a key name, you’ll save 10 bytes per document. That may not sound like much, but once you have a billion such documents, you’ll have saved nearly 10 GB of storage space just by using a shorter key name. This doesn’t mean you should go to unreasonable lengths to ensure small key names; be sensible. But if you expect massive amounts of data, economizing on key names will save space.
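The 10-byte figure in the quote can be checked by hand. The following is a minimal sketch, assuming only string-valued fields and following the BSON layout (a 4-byte length prefix, then elements of type byte, NUL-terminated key, and value, then a trailing NUL). The helper `bson_string_doc` and the sample value are illustrative, not part of any MongoDB driver.

```python
import struct


def bson_string_doc(fields: dict[str, str]) -> bytes:
    """Encode a flat dict of string values as BSON (string elements only)."""
    body = b""
    for key, value in fields.items():
        encoded = value.encode("utf-8") + b"\x00"
        # Element: type byte 0x02 (string), key as a C string, then the value.
        body += b"\x02" + key.encode("utf-8") + b"\x00"
        # String value: int32 byte length (including the NUL), then the bytes.
        body += struct.pack("<i", len(encoded)) + encoded
    # Document: int32 total size, the elements, and a terminating NUL.
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


long_doc = bson_string_doc({"date_of_birth": "1970-01-01"})
short_doc = bson_string_doc({"dob": "1970-01-01"})

saved_per_doc = len(long_doc) - len(short_doc)          # bytes saved per document
total_saved_gb = saved_per_doc * 1_000_000_000 / 10**9  # across a billion documents

print(len(long_doc), len(short_doc), saved_per_doc, total_saved_gb)
```

The key name is written into every document, so the saving is exactly the difference in key length (13 bytes versus 3), repeated once per document.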

I am interested in the reason why this is not optimized on the database server side. Would an in-memory lookup table with all key names in the collection carry so large a performance penalty that it is not worth the potential space savings?

Recommended answer

What you are referring to is often called "key compression"*. There are several reasons why it hasn't been implemented:

  1. If you want it done, you can currently do it at the Application/ORM/ODM level quite easily.
  2. It's not necessarily a performance** advantage in all cases: think of collections with lots of key names, and/or key names that vary wildly between documents.
  3. It might not provide a measurable performance** advantage at all until you have millions of documents.
  4. If the server does it, the full key names still have to be transmitted over the network.
  5. If compressed key names are transmitted over the network, then readability really suffers when using the JavaScript console.
  6. Compressing the entire JSON document might offer an even better performance advantage.
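Point 1 can be illustrated with a small translation layer. This is a minimal sketch, assuming flat documents and a hand-maintained alias table. The names `ALIASES`, `to_stored`, and `from_stored` are hypothetical and do not come from any particular ORM/ODM.

```python
# Hypothetical alias table: readable name in the application -> short stored key.
ALIASES = {"date_of_birth": "dob", "first_name": "fn"}
REVERSE = {short: long for long, short in ALIASES.items()}


def to_stored(doc: dict) -> dict:
    """Rename known keys to their short form before writing to the database."""
    return {ALIASES.get(key, key): value for key, value in doc.items()}


def from_stored(doc: dict) -> dict:
    """Restore readable key names after reading from the database."""
    return {REVERSE.get(key, key): value for key, value in doc.items()}


app_doc = {"date_of_birth": "1970-01-01", "first_name": "Ada", "email": "ada@example.com"}
stored = to_stored(app_doc)     # what would be sent to the server
restored = from_stored(stored)  # what the application sees again

print(stored)
print(restored == app_doc)
```

A real layer would also have to translate key names in queries, index definitions, and nested documents, which this sketch leaves out. It also shows the cost named in point 5: anyone inspecting the stored documents directly sees only the short keys.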

Like all features, there's a cost-benefit analysis for implementing it, and (at least so far) other features have offered more "bang for the buck".

Full document compression was originally [being considered][1] for a future MongoDB version; it is now available as of version 3.0 (see below).

* An in-memory lookup table for key names is basically a special case of LZW style compression — that's more or less what most compression algorithms do.

** Compression provides both a space advantage and a performance advantage. Smaller documents mean that more documents can be read per I/O operation, so a system with fixed I/O capacity can read more documents per second.
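The lookup table from the first footnote, and the caveat in point 2 about key names that vary between documents, can be sketched as follows. `KeyTable` is a hypothetical illustration of the idea, not something MongoDB implements.

```python
class KeyTable:
    """Hypothetical per-collection table mapping key names to small integer ids."""

    def __init__(self) -> None:
        self.ids: dict[str, int] = {}
        self.names: list[str] = []

    def encode(self, doc: dict) -> dict:
        """Replace each key name with its id, registering unseen names."""
        out = {}
        for key, value in doc.items():
            if key not in self.ids:
                self.ids[key] = len(self.names)
                self.names.append(key)
            out[self.ids[key]] = value
        return out

    def decode(self, encoded: dict) -> dict:
        """Map ids back to the original key names."""
        return {self.names[key_id]: value for key_id, value in encoded.items()}


# Every document shares the same two keys: the table stays tiny.
uniform = KeyTable()
for i in range(1000):
    uniform.encode({"date_of_birth": "1970-01-01", "first_name": f"user{i}"})

# Every document brings a new key: the table grows with the collection.
varied = KeyTable()
for i in range(1000):
    varied.encode({f"attribute_{i}": i})

uniform_size = len(uniform.names)
varied_size = len(varied.names)
print(uniform_size, varied_size)
```

With uniform documents the table holds two entries however many documents exist. With widely varying keys it holds one entry per distinct key, so the table itself becomes a cost to store and look up.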

MongoDB versions 3.0 and up now have full document compression capability with the WiredTiger storage engine.

Two compression algorithms are available: snappy, and zlib. The intent is for snappy to be the best choice for all-around performance, and for zlib to be the best choice for maximum storage capacity.
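To see why general-purpose compression largely removes the cost of long key names, here is a rough sketch using Python's standard `zlib` module on a batch of documents. It assumes JSON text as a stand-in for BSON and one compressed batch as a stand-in for a storage block, so it does not measure what WiredTiger actually does. The field names and data are made up.

```python
import json
import zlib


def make_batch(dob_key: str, name_key: str, count: int = 1000) -> bytes:
    """Serialize `count` small documents, one JSON object per line."""
    docs = (
        {dob_key: f"19{70 + i % 30}-01-01", name_key: f"user{i}"}
        for i in range(count)
    )
    return "\n".join(json.dumps(doc) for doc in docs).encode("utf-8")


long_raw = make_batch("date_of_birth", "first_name")
short_raw = make_batch("dob", "fn")

long_packed = zlib.compress(long_raw)
short_packed = zlib.compress(short_raw)

raw_gap = len(long_raw) - len(short_raw)            # cost of long keys, uncompressed
packed_gap = len(long_packed) - len(short_packed)   # cost of long keys, compressed

print(len(long_raw), len(short_raw), raw_gap)
print(len(long_packed), len(short_packed), packed_gap)
```

Uncompressed, the long key names cost 18 extra bytes per document. After compression each repeated key name becomes a short back-reference, so the gap between the two batches shrinks to a small fraction of the raw gap.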

In my personal (non-scientific, but related to a commercial project) experimentation, snappy compression (we didn't evaluate zlib) offered significantly improved storage density at no noticeable net performance cost. In fact, there was slightly better performance in some cases, roughly in line with my previous comments/predictions.
