Why are relational databases having scalability issues?


Question

Recently I read some articles online indicating that relational databases have scaling issues and are not a good fit for big data, especially in cloud computing, where the data is big. But I could not find good, solid reasons by googling for why they don't scale well. Can you please explain the limitations of relational databases when it comes to scalability?

Thanks.

Recommended answer

Relational databases provide solid, mature services built around the ACID properties: transaction handling, efficient logging to enable recovery, and so on. These are the core services of relational databases, and the ones they are good at. They are hard to customize, though, and can become a bottleneck, especially if a given application doesn't need them (e.g. serving low-importance website content; in that case, for example, the widely used MySQL with the MyISAM storage engine, its default before version 5.5, does not provide transaction handling and therefore does not satisfy ACID). Lots of "big data" problems don't require these strict constraints, for example web analytics, web search, or processing moving-object trajectories, as they already include uncertainty by nature.
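To make "transaction handling" concrete, here is a minimal sketch of atomicity using Python's built-in `sqlite3` module. The `accounts` table, the `make_demo_db` and `transfer` helpers, and the balances are all made up for illustration; they are not from the answer. The point is only that a failed multi-statement update leaves no partial changes behind.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " name TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    with conn:  # commit the initial rows
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [("alice", 100), ("bob", 50)],
        )
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        # The connection used as a context manager commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src cannot afford the amount.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return all balances as a dict, for inspection."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


if __name__ == "__main__":
    db = make_demo_db()
    print(transfer(db, "alice", "bob", 500), balances(db))
    print(transfer(db, "alice", "bob", 30), balances(db))
```

In the failing transfer the credit to `bob` has already been executed when the debit is rejected, yet the rollback undoes it. This bookkeeping is the kind of service the database performs on every write, whether or not the application needs it.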

When you reach the limits of a single computer (memory, CPU, disk: the data is too big, or the data processing is too complex and costly), distributing the service is a good idea. Lots of relational and NoSQL databases offer distributed storage. In that setting, however, ACID turns out to be difficult to satisfy. The CAP theorem states something similar: a distributed system cannot guarantee consistency, availability and partition tolerance all at the same time. If we give up ACID (settling for BASE, for example), scalability might be increased. See this post, e.g., for a categorization of storage methods according to CAP.
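As a rough illustration of why distribution makes ACID harder, the sketch below hash-partitions keys across a number of shards. The function names and the choice of SHA-256 are assumptions made for this example, not how any particular database does it. Once the rows touched by one transaction land on different shards, the nodes have to coordinate to commit atomically.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (illustrative scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_touched(keys, num_shards: int) -> set:
    """Return the set of shards a transaction over `keys` would involve."""
    return {shard_for(key, num_shards) for key in keys}


if __name__ == "__main__":
    # A hypothetical transfer between two accounts on a 4-node cluster.
    involved = shards_touched(["account:alice", "account:bob"], 4)
    if len(involved) > 1:
        print("cross-shard transaction, needs coordination:", involved)
    else:
        print("single-shard transaction:", involved)
```

With a single shard every transaction is local. With many shards, multi-row transactions increasingly span nodes, and each of them pays for a coordination protocol or has to accept weaker guarantees.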

Another bottleneck might be the flexible and clever typed relational model itself, together with its SQL operations: in lots of cases a simpler model with simpler operations (like an untyped key-value store) would be sufficient and more efficient. The common row-wise physical storage model might also be limiting: for example, it isn't optimal for data compression.
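The compression point can be sketched with a toy run-length encoder. The data set (1000 rows with a unique id and one of two country codes) and the helper name are invented for this example, and real engines use far more sophisticated encodings. Stored row by row, the two fields interleave and no value repeats consecutively; stored column by column, the country column collapses into two runs.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(values)]


# Hypothetical table: (id, country), with the first half "DE", the rest "US".
rows = [(i, "DE" if i < 500 else "US") for i in range(1000)]

# Row-wise layout: fields of each row stored next to each other.
row_wise = [field for row in rows for field in row]

# Column-wise layout: each column stored contiguously.
ids = [row[0] for row in rows]
countries = [row[1] for row in rows]

row_runs = len(run_length_encode(row_wise))
column_runs = len(run_length_encode(ids)) + len(run_length_encode(countries))

if __name__ == "__main__":
    print("runs, row-wise layout:   ", row_runs)
    print("runs, column-wise layout:", column_runs)
```

Here the row-wise layout needs 2000 runs and the column-wise layout 1002, because similar values sit next to each other only in the columnar form.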

There are, however, fast and scalable ACID-compliant relational databases, including new ones like VoltDB, as relational database technology is mature, well researched and widespread. We just have to select an appropriate solution for the given problem.
