DynamoDB: When to use what PK type?

Question

I am reading up on best practices for DynamoDB. I saw that DynamoDB has two primary key (PK) types:

  1. Hash key
  2. Hash and range key

From what I have read, the latter appears to be like the former, but it also supports sorting and indexing of a finite set of columns.

So my question is: why would you ever use only a hash key without a range key? Is it a viable choice only when the table is not searched?

It would also be great to have some general guidelines on when to use which key type. I have read several guides (including Amazon's own documentation on DynamoDB), but none of them appears to address this question directly.

Thanks

Recommended answer

The choice of key comes down to the use cases and data requirements of your particular scenario. For example, if you are storing User Session Data, a Range Key may not make much sense, since each record can be referenced by a GUID and accessed directly, with no grouping requirement. Once you know the Session Id, you simply get the specific item by its key. Another example is User Account or Profile data: each user has their own record, and you will most likely access it directly (by User Id or something similar).

However, if you are storing Order Items, then a Range Key makes much more sense, since you probably want to retrieve the items grouped by their Order.

In terms of the Data Model, the Hash Key identifies a record in your table (on its own in a hash-only table, or together with the Range Key in a composite key), and the Range Key can optionally be used to group and sort several records that are usually retrieved together. Example: if you are defining an Aggregate to store Order Items, the Order Id could be your Hash Key and the OrderItemId the Range Key. Whenever you want the Order Items of a particular Order, you query by the Hash Key (Order Id) and get all of that order's items.
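The grouping behaviour described above can be illustrated with a small in-memory sketch. This is not DynamoDB's API: the class `CompositeKeyTable`, its methods, and the sample order data are all hypothetical names made up for illustration. It only shows the idea that items sharing a hash key come back together, ordered by range key.

```python
from collections import defaultdict


class CompositeKeyTable:
    """Toy in-memory stand-in for a table with a hash key and a range key.

    Hypothetical illustration only; this is not how DynamoDB is implemented.
    """

    def __init__(self):
        # hash key -> {range key -> item}
        self._partitions = defaultdict(dict)

    def put_item(self, hash_key, range_key, item):
        # The pair (hash_key, range_key) is the full primary key,
        # so writing the same pair again overwrites the item.
        self._partitions[hash_key][range_key] = item

    def get_item(self, hash_key, range_key):
        # Direct lookup by the full primary key; None if absent.
        return self._partitions.get(hash_key, {}).get(range_key)

    def query(self, hash_key):
        # All items that share the hash key, sorted by range key.
        rows = self._partitions.get(hash_key, {})
        return [rows[range_key] for range_key in sorted(rows)]


table = CompositeKeyTable()
table.put_item("order-1", "item-002", {"sku": "pen", "qty": 3})
table.put_item("order-1", "item-001", {"sku": "book", "qty": 1})
table.put_item("order-2", "item-001", {"sku": "lamp", "qty": 1})

# One query by Order Id returns every item of that order.
order_1_items = table.query("order-1")
```

A hash-only table corresponds to the degenerate case where each hash key holds exactly one item, which is why session or profile data does not benefit from the extra key.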

Below is a more formal definition of how these two keys are used:

"Composite Hash Key with Range Key allows the developer to create a primary key that is the composite of two attributes, a 'hash attribute' and a 'range attribute.' When querying against a composite key, the hash attribute needs to be uniquely matched but a range operation can be specified for the range attribute: e.g. all orders from Werner in the past 24 hours, or all games played by an individual player in the past 24 hours." [VOGELS]
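To make the "range operation" in the quote concrete, here is a minimal sketch that assumes the range key is an order timestamp in epoch seconds. The function `query_between`, the fixed `now` value, and the order names are hypothetical; the code is a plain-Python illustration, not a DynamoDB call.

```python
from bisect import bisect_left, bisect_right

DAY = 24 * 60 * 60  # seconds in 24 hours


def query_between(partition, low, high):
    """Return items whose range key lies in [low, high], in range-key order.

    `partition` is the list of (range_key, item) pairs stored under one
    hash key, already sorted by range key.
    """
    keys = [range_key for range_key, _ in partition]
    start = bisect_left(keys, low)
    end = bisect_right(keys, high)
    return [item for _, item in partition[start:end]]


# Assumed sample data: all orders under the hash key "Werner",
# keyed by the time the order was placed.
now = 1_700_000_000
werner_orders = [
    (now - 3 * DAY, "order-a"),
    (now - DAY - 1, "order-b"),
    (now - 3600, "order-c"),
    (now - 60, "order-d"),
]

# "All orders from Werner in the past 24 hours":
# exact match on the hash key, range condition on the range key.
recent = query_between(werner_orders, now - DAY, now)
```

Because the items under one hash key are kept in range-key order, the range condition only has to locate two boundaries instead of scanning every item.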

So the Range Key adds a grouping capability to the Data Model. However, the use of these two keys also has implications for the Storage Model:

"Dynamo uses consistent hashing to partition its key space across its replicas and to ensure uniform load distribution. A uniform key distribution can help us achieve uniform load distribution assuming the access distribution of keys is not highly skewed." [DDB-SOSP2007]

The Hash Key is not only used to identify the record; it is also the mechanism that ensures load distribution. The Range Key (when used) indicates which records will mostly be retrieved together, so the storage can also be optimized for that access pattern.
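As a rough illustration of how a hash key can drive load distribution, the sketch below builds a small consistent-hash ring in plain Python. It is a simplified teaching example in the spirit of the quoted Dynamo paper, not DynamoDB's actual partitioning logic; `HashRing`, the node names, the choice of MD5, and the virtual-node count are all assumptions.

```python
import hashlib
from bisect import bisect_right
from collections import Counter


def _hash(value: str) -> int:
    # Map a string to a large integer position on the ring.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        # Each node is placed at `vnodes` pseudo-random points on the ring.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, hash_key: str) -> str:
        # Walk clockwise to the first ring point after the key's hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect_right(self._points, _hash(hash_key)) % len(self._ring)
        return self._ring[index][1]


ring = HashRing(["node-a", "node-b", "node-c"])

# Many distinct hash keys (for example session ids) spread across the nodes.
placement = Counter(ring.node_for(f"session-{i}") for i in range(3000))

# All items of one order share a hash key, so they land on the same node.
same_node = ring.node_for("order-1") == ring.node_for("order-1")
```

The sketch also hints at why key choice matters: a hash key with few distinct values, or with a few very hot values, would concentrate traffic on a small part of the ring regardless of how evenly the hashing itself behaves.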

Choosing the right keys to represent your data is one of the most critical aspects of your design process, and it directly affects how well your application performs and scales, and how much it costs.

Footnotes:

  • The Data Model is the model through which we perceive and manipulate our data. It describes how we interact with the data in the database [FOWLER]. In other words, it is how you abstract your data: the way you group your entities, the attributes you choose as primary keys, and so on.

  • The Storage Model describes how the database stores and manipulates the data internally [FOWLER]. Although you cannot control this directly, you can certainly optimize how the data is retrieved or written by knowing how the database works internally.
