When to use DynamoDB: use cases


Question

I'm trying to figure out which use cases are the best fit for Amazon DynamoDB.

Most of the blogs I found on Google say that DynamoDB should be used only for large amounts of data (big data).

I have a relational database background, and NoSQL databases are new to me, so I've been trying to relate this to what I already know about relational databases.

Most of the DynamoDB concepts I've come across involve creating a schema-less table with partition keys and sort keys, and then querying it by those keys. There is also no concept of stored procedures, which make queries simpler and easier.

If we are managing such a huge amount of data, is it the right approach to run complex queries every time we retrieve data, without stored procedures?

Note: I may have misunderstood the concept, so I'd appreciate it if anyone could clear this up for me.

Thanks in advance,
Jay

Recommended answer

In short, systems like DynamoDB are designed to support big data sets (too big to fit on a single server) and high read/write throughput by scaling horizontally, as opposed to scaling vertically, which has historically been the more common approach for relational databases.

The main approach to supporting horizontal scalability is partitioning: a data set is split into multiple pieces that are distributed among multiple servers. This way the system can use more storage and more IOPS, allowing bigger data sets and higher read/write throughput.
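The partitioning idea can be sketched in a few lines. This is a minimal in-memory illustration, not how DynamoDB is implemented: the `PartitionedStore` class, the `partition_for` helper, and the choice of MD5 as the hash are all assumptions made for the example, and each dict stands in for one server.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index using a stable hash.

    MD5 is used here only because it is deterministic across runs;
    the hash a real system uses is an internal detail.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class PartitionedStore:
    """Hypothetical key-value store; each dict stands in for one server."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions = [dict() for _ in range(num_partitions)]

    def put(self, key: str, record: dict) -> None:
        idx = partition_for(key, len(self.partitions))
        self.partitions[idx][key] = record

    def get(self, key: str):
        # The key alone tells us which partition to ask.
        idx = partition_for(key, len(self.partitions))
        return self.partitions[idx].get(key)


store = PartitionedStore(num_partitions=4)
for i in range(1000):
    store.put(f"user#{i}", {"id": i})

# Number of records that landed on each partition.
sizes = [len(p) for p in store.partitions]
```

Because every record is placed by its key, adding partitions adds both storage and request capacity, which is the scaling benefit described above.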

However, data partitioning makes it difficult to support complex queries, such as joins, because the data is distributed among multiple physical servers. Stored procedures are not supported for the same reason. Historically, the idea behind stored procedures is data locality: they run on the server next to the data, without network operations. If the data is distributed among multiple servers, this benefit disappears (at least in the form of stored procedures).
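To make the cost of a join over partitioned data concrete, here is a small simulation. Everything in it is hypothetical: the `customers` and `orders` tables, the `server_for` placement rule (a simple modulo on an integer key), and the sample rows. It counts how many lookups have to leave the server that holds the order.

```python
NUM_SERVERS = 3


def server_for(key: int) -> int:
    """Hypothetical placement rule: which server holds a given key."""
    return key % NUM_SERVERS


# One dict per server for each table; each table is partitioned by its own key.
customers = [dict() for _ in range(NUM_SERVERS)]
orders = [dict() for _ in range(NUM_SERVERS)]

for cid, name in [(1, "Ann"), (2, "Bob"), (3, "Cy")]:
    customers[server_for(cid)][cid] = {"customer_id": cid, "name": name}

for oid, cid in [(10, 2), (11, 3), (12, 1), (13, 1)]:
    orders[server_for(oid)][oid] = {"order_id": oid, "customer_id": cid}


def join_orders_with_customers():
    """Application-side join that counts lookups crossing server boundaries."""
    rows = []
    cross_server_lookups = 0
    for server_idx, shard in enumerate(orders):
        for order in shard.values():
            cid = order["customer_id"]
            target = server_for(cid)
            if target != server_idx:
                # The matching customer lives on a different server,
                # so a real system would pay a network round trip here.
                cross_server_lookups += 1
            customer = customers[target][cid]
            rows.append((order["order_id"], customer["name"]))
    return sorted(rows), cross_server_lookups


rows, hops = join_orders_with_customers()
```

In this toy data set, three of the four orders need their customer fetched from another server, which is the locality loss that makes joins and stored procedures a poor fit for partitioned stores.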

Therefore the most efficient way to query data from such systems is by record key, because data partitioning is based on the key and it is easy to figure out where a record physically lives for a given key. While many such systems also support secondary indexes, these are usually restricted in some way or expensive, and they may not be enough to satisfy the requirements of a complex software solution. A quite common approach is to add a complementary indexing/query solution (I've seen solutions based on Elasticsearch and Solr), which allows running complex queries over some fragments of the records to find a record key, which is then used to load the record.
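The "search the index for keys, then load by key" pattern can be sketched as follows. This is a toy under stated assumptions: a plain dict stands in for the key-value store, a small inverted index stands in for Elasticsearch or Solr, and the names `put`, `search_keys`, and `find` are invented for the example.

```python
from collections import defaultdict

# Stand-in for the key-value store: record key -> full record.
records = {}

# Stand-in for the search index: term -> set of record keys.
# Only a fragment of each record (the title) is indexed.
index = defaultdict(set)


def put(key: str, record: dict) -> None:
    """Write the record and index the words of its title."""
    records[key] = record
    for word in record["title"].lower().split():
        index[word].add(key)


def search_keys(*terms: str) -> list:
    """Return the keys of records whose title contains every term."""
    if not terms:
        return []
    key_sets = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*key_sets))


def find(*terms: str) -> list:
    """Query the index for keys, then load each record by key."""
    return [records[key] for key in search_keys(*terms)]


put("doc#1", {"title": "Scaling DynamoDB tables"})
put("doc#2", {"title": "Scaling relational databases"})
put("doc#3", {"title": "DynamoDB secondary indexes"})

matches = find("scaling", "dynamodb")
```

The complex part of the query runs against the index only, and the main store is still accessed purely by key. The trade-off is that the two systems have to be kept in sync on every write.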
