Sharding key (MongoDB) for a large number of documents

Question

I am developing a web application where users will upload a large number of documents to the system, and different types of operations, including aggregation, will be performed on those documents. However, the number of documents uploaded by each user varies widely: some might upload a dozen documents, and some might upload a million.

The documents look like this:

doc{
    _id: <self generated UUID>,
    uid: <id of user who uploaded the document>,
    ctime: <creation timestamp>,
    ....
        <other attributes, etc>
    ....
}

Now here is the problem in choosing the shard key:
1. If I choose the UUID as the shard key, documents uploaded by the same user are unlikely to end up in the same shard and aggregation operations will be costly.
2. If I use uid as the shard key then the data stored in shards will not be even.
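The two options above can be compared with a small simulation. This is only a sketch under stated assumptions: the shard count, the user sizes, and the `shard_for` helper are made up for illustration, and a stable hash modulo the shard count stands in for MongoDB's actual chunk placement, which it does not reproduce.

```python
import hashlib
import uuid
from collections import Counter, defaultdict

N_SHARDS = 4  # hypothetical cluster size


def shard_for(value: str) -> int:
    """Map a shard-key value to a shard using a stable hash (toy placement rule)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % N_SHARDS


def make_docs():
    """One heavy user with 9,000 documents and 100 light users with 10 each (made-up numbers)."""
    users = {"heavy": 9000}
    users.update({f"user{n}": 10 for n in range(100)})
    docs = []
    for uid, count in users.items():
        for i in range(count):
            # Deterministic UUIDs so the run is repeatable.
            doc_id = str(uuid.uuid5(uuid.NAMESPACE_OID, f"{uid}-{i}"))
            docs.append({"_id": doc_id, "uid": uid})
    return docs


def simulate(docs, key):
    """Return (documents per shard, shards touched per user) when sharding on `key`."""
    load = Counter()
    touched = defaultdict(set)
    for doc in docs:
        shard = shard_for(doc[key])
        load[shard] += 1
        touched[doc["uid"]].add(shard)
    return load, touched


docs = make_docs()
for key in ("_id", "uid"):
    load, touched = simulate(docs, key)
    print(key, "| docs per shard:", sorted(load.items()),
          "| shards holding 'heavy':", len(touched["heavy"]))
```

Keyed on `_id`, the heavy user's documents are scattered over several shards, so a per-user aggregation has to visit all of them. Keyed on `uid`, each user lives on exactly one shard, but whichever shard receives the heavy user carries at least 9,000 of the 10,000 documents.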

Can anyone suggest which is the best way to achieve this?

I am very new to partitioning and sharding and my research on google as well as stack-overflow did not yield anything. I can change the schema of the documents if needed since the project is still at the design phase.

Recommended answer

This is the best guide I've seen on choosing a shard key: http://www.kchodorow.com/blog/2011/01/04/how-to-choose-a-shard-key-the-card-game/

You have to decide how you want to query the data. Perhaps a combination of uid and ctime will yield a good shard key, but I'm not sure if that will cause you grief while querying, as you haven't given much insight on how you plan to query.
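To illustrate why a compound key on uid and ctime might help, here is a toy model of range-based chunking. It assumes, as MongoDB does, that documents sharing the same shard-key value cannot be separated into different chunks. Everything else is hypothetical: the chunk limit, the user sizes, and the `split_into_chunks` helper are invented for the sketch and do not mirror MongoDB's real splitting or balancing logic.

```python
from itertools import groupby

CHUNK_LIMIT = 1000  # hypothetical maximum number of documents per chunk


def make_docs():
    """One heavy user with 9,000 documents and 100 light users with 10 each (made-up numbers)."""
    users = {"heavy": 9000}
    users.update({f"user{n:03d}": 10 for n in range(100)})
    # ctime is a simple per-user counter here; real timestamps would work the same way.
    return [{"uid": uid, "ctime": t} for uid, count in users.items() for t in range(count)]


def split_into_chunks(docs, key_fields):
    """Range-partition docs on key_fields, never separating documents with equal key values."""
    def key(doc):
        return tuple(doc[f] for f in key_fields)

    chunks, current = [], []
    for _, group in groupby(sorted(docs, key=key), key=key):
        group = list(group)
        # Start a new chunk only on a key boundary, and only if this one would overflow.
        if current and len(current) + len(group) > CHUNK_LIMIT:
            chunks.append(current)
            current = []
        current.extend(group)
    if current:
        chunks.append(current)
    return chunks


docs = make_docs()
for key_fields in (("uid",), ("uid", "ctime")):
    chunks = split_into_chunks(docs, key_fields)
    print(key_fields, "| chunks:", len(chunks),
          "| largest chunk:", max(len(c) for c in chunks))
```

In this model the `uid`-only key leaves the heavy user in a single 9,000-document chunk that cannot be split, while `(uid, ctime)` breaks the same user into chunks of at most 1,000 documents that could be placed on different shards. Light users still fit inside one chunk, so a query that filters on `uid` only needs the chunks covering that user's range. Whether this suits the workload still depends on the query patterns, as the answer notes.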
