What is the use of a Hash range in a DynamoDB table?


Question

I am new to DynamoDB (DDB). I was going through its documentation, which says to add a Hash Key and a Hash Range Key. According to the documentation, DDB creates an unsorted index on the hash key and a sorted index on the hash range key.

What is the purpose of having these two keys rather than just one? Is it because the keys are used like two levels of hash tables:

1st HashTable: hash key -> range of keys for each value in the hash range

2nd HashTable: hash range key -> actual data value

This would help segregate the data and make lookups fast. But then why only 2 levels of HashMaps? I could do this for n layers and get even faster lookups.

Thank you in advance.

Recommended answer

Q: "What is the purpose of having these 2 keys rather than just one key?"

In terms of the Data Model, the Hash Key identifies a record in your table (uniquely, when no Range Key is defined), and the Range Key can optionally be used to group and sort several records that are usually retrieved together; in that case the pair of keys identifies a single record. Example: if you are defining an Aggregate to store Order Items, the OrderId could be your Hash Key and the OrderItemId the Range Key. Below is a formal definition of the use of these two keys:

"Composite Hash Key with Range Key allows the developer to create a primary key that is the composite of two attributes, a 'hash attribute' and a 'range attribute.' When querying against a composite key, the hash attribute needs to be uniquely matched but a range operation can be specified for the range attribute: e.g. all orders from Werner in the past 24 hours, or all games played by an individual player in the past 24 hours." [VOGELS]
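The query behaviour described in the quote can be illustrated with a small in-memory model. This is only a sketch of the idea, not how DynamoDB is implemented: the class `CompositeKeyTable`, its methods, and the sample keys are all hypothetical names chosen for illustration. The hash key is matched exactly, and the range key is searched with a range operation over a sorted list.

```python
import bisect
from collections import defaultdict


class CompositeKeyTable:
    """Toy in-memory table (hypothetical): hash key -> items sorted by range key."""

    def __init__(self):
        self._range_keys = defaultdict(list)  # hash key -> sorted list of range keys
        self._items = {}                      # (hash key, range key) -> item

    def put(self, hash_key, range_key, item):
        # Keep the range keys of each hash key sorted as items arrive.
        if (hash_key, range_key) not in self._items:
            bisect.insort(self._range_keys[hash_key], range_key)
        self._items[(hash_key, range_key)] = item

    def query(self, hash_key, low, high):
        """Exact match on the hash key, inclusive range on the range key."""
        keys = self._range_keys.get(hash_key, [])
        start = bisect.bisect_left(keys, low)
        end = bisect.bisect_right(keys, high)
        return [self._items[(hash_key, k)] for k in keys[start:end]]


table = CompositeKeyTable()
# The range key here stands in for a timestamp (assumed integer values).
table.put("werner", 10, "order-a")
table.put("werner", 30, "order-c")
table.put("werner", 20, "order-b")
table.put("alice", 15, "order-x")

# "All orders from werner with a timestamp between 15 and 30", in sorted order.
recent = table.query("werner", 15, 30)
print(recent)
```

Because the items under one hash key are kept sorted by range key, the range query touches only the matching slice instead of scanning every item.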

So the Range Key adds a grouping capability to the Data Model. However, the use of these two keys also has an implication for the Storage Model:

"Dynamo uses consistent hashing to partition its key space across its replicas and to ensure uniform load distribution. A uniform key distribution can help us achieve uniform load distribution assuming the access distribution of keys is not highly skewed." [DDB-SOSP2007]

Not only does the Hash Key identify the record, it is also the mechanism that ensures load distribution. The Range Key (when used) indicates which records will mostly be retrieved together, so the storage can also be optimized for that access pattern.
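To make the load-distribution point concrete, here is a rough sketch of hashing a key to pick a partition. The partition count, the use of MD5, and the `partition_for` helper are assumptions made for illustration and say nothing about DynamoDB's internals. It contrasts many distinct hash keys with a single "hot" hash key.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical partition count, for illustration only


def partition_for(hash_key: str) -> int:
    """Map a hash key to a partition by hashing it (assumed scheme)."""
    digest = hashlib.md5(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS


# 10,000 requests spread over many distinct hash keys.
spread = Counter(partition_for(f"order-{i}") for i in range(10_000))

# 10,000 requests that all use the same hash key.
skewed = Counter(partition_for("order-1") for _ in range(10_000))

print("distinct keys:", dict(spread))
print("one hot key:  ", dict(skewed))
```

With many distinct hash keys the requests land roughly evenly on all partitions, while a single hot key sends every request to one partition. This is the skew the quoted passage warns about.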

Q: "But then why only 2 levels of HashMaps? I could do this for n layers and get the faster lookups."

Having many layers of lookups would add significant complexity to running the database effectively in a cluster environment, which is one of the most essential use cases for the majority of NoSQL databases. The database has to be highly available, fault-tolerant, and scalable, and still perform well in a distributed environment.

"One of the key design requirements for Dynamo is that it must scale incrementally. This requires a mechanism to dynamically partition the data over the set of nodes (i.e., storage hosts) in the system. Dynamo’s partitioning scheme relies on consistent hashing to distribute the load across multiple storage hosts." [DDB-SOSP2007]
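The incremental-scaling property of consistent hashing mentioned in the quote can be sketched with a small hash ring. This is a minimal illustration under stated assumptions, not Dynamo's actual implementation: the `HashRing` class, the host names, the MD5 hash, and the number of virtual nodes are all hypothetical choices.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Position of a value on the ring (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Each host gets several positions on the ring to smooth the load.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        # A key belongs to the first ring position at or after its own hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]


keys = [f"order-{i}" for i in range(10_000)]
ring = HashRing(["host-a", "host-b", "host-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("host-d")  # scale out by one storage host
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new host")
```

Adding a host relocates only the keys that now fall to the new host; every other key stays where it was. A single flat hash key fits this scheme well, whereas each extra lookup layer would have to be partitioned and kept consistent across hosts too.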

It is always a trade-off: every limitation that you see in NoSQL databases is most likely introduced by the storage model requirements. Although relational databases are very flexible in terms of data modeling, they have several limitations when it comes to running in a distributed environment.

Choosing the correct keys to represent your data is one of the most critical aspects of your design process, and it directly impacts how well your application performs and scales, and how much it costs.

Footnotes:

  • The Data Model is the model through which we perceive and manipulate our data. It describes how we interact with the data in the database [FOWLER]. In other words, it is how you abstract your data: the way you group your entities, the attributes that you choose as primary keys, etc.

  • The Storage Model describes how the database stores and manipulates the data internally [FOWLER]. Although you cannot control this directly, you can certainly optimize how the data is retrieved or written by knowing how the database works internally.
