Database sharding vs partitioning

Question

I have been reading about scalable architectures recently. In that context, two words that keep showing up with regard to databases are sharding and partitioning. I looked up descriptions but still ended up confused.

Could the experts at stackoverflow help me get the basics right?

  • What is the difference between sharding and partitioning?
  • Is it true that 'all sharded databases are essentially partitioned (over different nodes), but all partitioned databases are not necessarily sharded'?

Recommended answer

Partitioning is the more generic term for dividing data across tables or databases. Sharding is one specific type of partitioning, a form of what is called horizontal partitioning.

Here you replicate the schema across (typically) multiple instances or servers, and use some kind of logic or identifier to decide which instance or server holds a given piece of data. An identifier of this kind is often called a "shard key".
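
To make the shard-key idea concrete, here is a minimal Python sketch, assuming a fixed number of instances and a simple hash-modulo rule. The names `NUM_SHARDS` and `shard_for` and the key format are hypothetical, and real systems often use more elaborate routing.

```python
import zlib

NUM_SHARDS = 4  # hypothetical number of database instances


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to an instance index with a stable checksum."""
    # zlib.crc32 returns the same value in every process, unlike the
    # built-in hash(), which is randomized per interpreter run for strings.
    return zlib.crc32(shard_key.encode("utf-8")) % num_shards


# The same key always routes to the same instance.
print(shard_for("customer:1001"))
```

Every instance holds the same schema; only the rows differ, and the shard key alone tells the application where to look.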

A common, key-less approach is to divide the data alphabetically: A-D goes to instance 1, E-G to instance 2, and so on. Customer data is well suited to this, but the instances will end up unevenly sized if the partitioning does not take into account that some letters are more common than others.
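
The alphabetical scheme can be sketched as follows. Only the A-D and E-G ranges come from the text above; the remaining ranges, the sample names, and the function name `instance_for` are assumptions made for illustration.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bound of each letter range. A-D -> instance 1 and
# E-G -> instance 2 follow the text; H-M, N-S and T-Z are assumed.
UPPER_BOUNDS = ["D", "G", "M", "S", "Z"]


def instance_for(name: str) -> int:
    """Return the 1-based instance number for a name, by its first letter."""
    first = name[0].upper()
    if not "A" <= first <= "Z":
        raise ValueError(f"unsupported first character: {first!r}")
    return bisect_left(UPPER_BOUNDS, first) + 1


# Counting rows per instance makes any skew visible.
names = ["Adams", "Baker", "Clark", "Davis", "Evans", "Smith", "Scott", "Young"]
print(Counter(instance_for(n) for n in names))
```

With this made-up sample, half of the names land on instance 1, which is the kind of imbalance the answer warns about.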

Another common technique is to use a key-synchronization system or logic that ensures unique keys across the instances.
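
One way to get unique keys without coordination between instances is to embed the shard number in the ID itself. The sketch below assumes a timestamp-plus-shard-plus-sequence layout; the bit widths, the custom epoch, and the names `IdGenerator` and `shard_of` are all illustrative assumptions, not a description of any particular system.

```python
import itertools
import time

# Assumed bit layout: milliseconds since a custom epoch in the high bits,
# then 13 bits of shard ID, then 10 bits of per-shard sequence.
EPOCH_MS = 1_300_000_000_000  # hypothetical custom epoch
SHARD_BITS = 13
SEQ_BITS = 10


class IdGenerator:
    """Generates IDs for one shard; no communication with other shards."""

    def __init__(self, shard_id: int):
        if not 0 <= shard_id < (1 << SHARD_BITS):
            raise ValueError("shard_id out of range")
        self.shard_id = shard_id
        self._counter = itertools.count()

    def next_id(self, now_ms=None) -> int:
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        seq = next(self._counter) % (1 << SEQ_BITS)
        timestamp = (now_ms - EPOCH_MS) << (SHARD_BITS + SEQ_BITS)
        return timestamp | (self.shard_id << SEQ_BITS) | seq


def shard_of(id_: int) -> int:
    """Recover the shard ID that is embedded in an ID."""
    return (id_ >> SEQ_BITS) & ((1 << SHARD_BITS) - 1)
```

Two shards can never produce the same ID because their shard bits differ, and within one shard the IDs stay unique as long as it issues fewer than 1024 of them per millisecond. The ID also doubles as a routing hint, since `shard_of` recovers the owning shard.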

A well-known example you can study is how Instagram handled partitioning in its early days (see link below). They started out partitioned across very few servers, using Postgres to divide the data from the get-go. I believe they had several thousand logical shards on those few physical shards. Read their excellent 2012 writeup here: Instagram Engineering - Sharding & IDs
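
The idea of many logical shards living on a few physical servers can be sketched like this. The shard count, the server names, and the round-robin placement are assumptions for illustration and are not taken from Instagram's writeup.

```python
import hashlib

NUM_LOGICAL_SHARDS = 4096  # hypothetical; fixed for the life of the system
PHYSICAL_SERVERS = ["db-a", "db-b"]  # hypothetical server names


def logical_shard(shard_key: str) -> int:
    """Map a key to a logical shard; this mapping never changes."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL_SHARDS


# Logical shard -> physical server. Only this table changes when
# servers are added or a logical shard is moved.
placement = {
    s: PHYSICAL_SERVERS[s % len(PHYSICAL_SERVERS)]
    for s in range(NUM_LOGICAL_SHARDS)
}


def server_for(shard_key: str) -> str:
    """Find the physical server that currently hosts the key's logical shard."""
    return placement[logical_shard(shard_key)]
```

Because keys are tied to logical shards rather than to servers, adding capacity means copying some logical shards to a new machine and updating `placement`; no key has to be rehashed.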

See also the Quora question on the difference between sharding and partitioning (http://www.quora.com).
