MongoDB 在单台机器上按日期分片 [英] MongoDB shard by date on a single machine

查看:201
本文介绍了MongoDB 在单台机器上按日期分片的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们从一个 mongodb 开始,但没有一个集合增长到约 300GB.该集合包含具有日期字段的对象.但大多数情况下,我们只需要查询最近的对象,然后再查询历史对象.所以我的问题是:是否可以通过日期字段在一台服务器上分片这个集合?更明确地说,我想将更新的对象分片到一个节点中,将旧的对象分片到另一个节点中.而不是将所有对象平均分布在 n 个分片上.

We have started with one single mongodb but no we have one collection grown to ~300GB. The collection contains objects which have a date field. But mostly we just need to query the more recent objects then the historic once. So my question is: is it possible to shard this collection on one server by a date field? More explicitly I would like to shard more recent objects into one node and older objects into another node. Instead of equally distributing all the objects on n shards.

是否有教程如何将现有的单个数据库(没有任何副本集)分片到分片集群中?

And is there a tutorial how one can shard an existing single database (without any replica sets) into a sharded cluster?

推荐答案

从技术上讲,您不需要分片您的内容,只需要索引您的字段.是的,您可以在日期字段上创建索引,它会受到尊重,您可以通过访问查询计划 db.collection.explain("executionStats")

Technically you don't need to shard your content and just need to index your field. Yes you can create index on date field and it would be respected which you can see by visiting the query plan db.collection.explain("executionStats")

但是,选择片键非常重要.选择片键时需要考虑的事情很少

However, Choosing a shard key is very important. there are few things to consider while choosing the shard key

- Write scaling (high cardinality, Randomization)
- Query Isolation. (read)

选择日期字段实际上提供了非常高的基数,但是它无法进行随机化,因此所有文档都存储在单个分片中,因此限制了系统的写入容量.出于同样的原因,不鼓励将 ObjectId 用作分片键.

choosing the date field actually gives a very high cardinality however it fails in doing the randomization and as a result all documents are stored into the single shard and hence it limits the write capacity of system. For the same reason ObjectId is discouraged to use as the shard key.

http://docs.mongodb.org/manual/core/sharding-shard-key/上面链接的内容.."MongoDB 在创建文档时生成 ObjectId 值以生成对象的唯一标识符.但是,该值中数据的最高有效位表示时间戳,这意味着它们以规则和可预测的模式递增.即使这个值具有很高的基数,当使用这个、任何日期或其他单调递增的数字作为shard key时,所有的insert操作都会将数据存储到一个chunk中,因此,一个shard.因此,写入容量为该分片将定义集群的有效写入容量."

这篇关于MongoDB 在单台机器上按日期分片的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆