How to performance tune my table in Dynamo db having DynamoDBAutoGeneratedKey as Hash Key as the PutRequest is getting slow with each insert


Problem description

I am using DynamoDB tables for saving the transactional data for my API requests. I am maintaining two tables: 1. schedule - with SId as hash key; 2. summary - with DynamoDBAutoGeneratedKey (UUID) as hash key and SId as an attribute.

The schedule table gets a single row per request, whereas the summary table gets 10 items per SId, each with a unique UUID.

We are running a load test on these two tables and have observed that the schedule table performs well, but the summary table spends a lot of time in the PutRequests for the 10 items per call.

Can anyone suggest how to performance tune my summary DynamoDB table? Can keeping a UUID as the hash key slow down PutItemRequest?

Any help or pointers are much appreciated.
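If the 10 summary items per call are currently written as 10 separate PutItem calls, batching them into one BatchWriteItem request usually removes most of the per-call round trips. Below is a minimal sketch using the Java SDK v1 DynamoDBMapper (the SDK that provides the DynamoDBAutoGeneratedKey annotation); the class, table, and attribute names are assumptions, not the actual schema.

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAutoGeneratedKey;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical model for the summary table: auto-generated UUID hash key plus an SId attribute.
    @DynamoDBTable(tableName = "summary")
    public class SummaryItem {
        private String id;
        private String sId;

        @DynamoDBHashKey(attributeName = "Id")
        @DynamoDBAutoGeneratedKey
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }

        @DynamoDBAttribute(attributeName = "SId")
        public String getSId() { return sId; }
        public void setSId(String sId) { this.sId = sId; }
    }

    // Writes the 10 items of one request with a single BatchWriteItem call
    // instead of 10 sequential PutItem calls.
    class SummaryWriter {
        private final AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();
        private final DynamoDBMapper mapper = new DynamoDBMapper(client);

        void writeSummaries(String sId, int count) {
            List<SummaryItem> batch = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                SummaryItem item = new SummaryItem();
                item.setSId(sId); // the UUID hash key is generated automatically on save
                batch.add(item);
            }
            // batchSave maps to BatchWriteItem (up to 25 items per request).
            mapper.batchSave(batch);
        }
    }

batchSave returns a list of failed batches, which is worth checking and retrying under load.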

Also, we have activated streams on these tables, which are consumed by a Lambda for cross-replication; a generic sketch of such a stream consumer is shown below.
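For completeness, a stream consumer of this kind typically looks like the sketch below. This is only a generic illustration using the aws-lambda-java-events types, not the actual replication Lambda; note that streams are read asynchronously after the write is committed, so the consumer itself should not be what slows the PutRequests down.

    import com.amazonaws.services.lambda.runtime.Context;
    import com.amazonaws.services.lambda.runtime.RequestHandler;
    import com.amazonaws.services.lambda.runtime.events.DynamodbEvent;
    import com.amazonaws.services.lambda.runtime.events.DynamodbEvent.DynamodbStreamRecord;

    // Generic DynamoDB Streams consumer: receives batches of change records
    // and would forward them to the replica table.
    public class ReplicationHandler implements RequestHandler<DynamodbEvent, Void> {
        @Override
        public Void handleRequest(DynamodbEvent event, Context context) {
            for (DynamodbStreamRecord record : event.getRecords()) {
                // Event name is INSERT / MODIFY / REMOVE; NewImage holds the item after the change.
                context.getLogger().log(record.getEventName() + " -> " + record.getDynamodb().getKeys());
                // Replication to the target table would happen here.
            }
            return null;
        }
    }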

Recommended answer

A couple of things come to mind:

  • Are you using scans by any chance? This would explain the performance degradation, since scans do not exploit any knowledge about how the data is organised in DynamoDB and are simply a brute-force search. You should avoid scans because they are inherently slow and expensive (a Query sketch keyed on SId follows this list).

  • Do you have a "hot partition"? You wrote:

    1. schedule - with SId as hashkey 2. summary - with DynamoDBAutoGeneratedKey (UUID) as hashkey and SId as an Attribute to it.

    Is access to these values uniformly distributed? Do you have items that are accessed more often than others? If so, this may be an issue: if the majority of your reads/writes go to a small subset of ids, it means you are flooding a single partition (a physical machine) with requests. I would suggest investigating this as well.

    One solution is to use a cache and store frequently accessed items there. You can use either ElastiCache or DAX, a new caching solution for DynamoDB; you can read more about DAX in the AWS documentation (a minimal DAX client sketch follows this list).

  • Are you using transactions? You wrote:

    I am using dynamo db tables for saving the transactional data

    If by this you mean that you are using DynamoDB transactions, you need to read how DynamoDB implements transactions.

    Long story short, DynamoDB stores copies of all items that you update/delete/add when you perform a transaction. Additionally, DynamoDB transactions are expensive: they require 7N+4 writes per transaction, where N is the number of items involved in the transaction (a quick worked example follows this list).
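On the first point: if the summary items for a given SId are ever read back with a Scan plus filter, a Query against a secondary index keyed on SId is the usual fix. A rough sketch, assuming a global secondary index named SId-index exists (the index is not part of the original schema) and reusing the hypothetical SummaryItem model from the earlier sketch:

    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBQueryExpression;
    import com.amazonaws.services.dynamodbv2.model.AttributeValue;
    import java.util.Collections;
    import java.util.List;

    class SummaryReader {
        private final DynamoDBMapper mapper;

        SummaryReader(DynamoDBMapper mapper) {
            this.mapper = mapper;
        }

        // Queries the (hypothetical) SId-index GSI instead of scanning the whole table.
        List<SummaryItem> findBySId(String sId) {
            DynamoDBQueryExpression<SummaryItem> query = new DynamoDBQueryExpression<SummaryItem>()
                    .withIndexName("SId-index")
                    .withConsistentRead(false) // GSIs only support eventually consistent reads
                    .withKeyConditionExpression("SId = :sid")
                    .withExpressionAttributeValues(
                            Collections.singletonMap(":sid", new AttributeValue(sId)));
            return mapper.query(SummaryItem.class, query);
        }
    }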
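On the caching suggestion: DAX exposes the same AmazonDynamoDB interface as the regular client, so it can be dropped in front of reads with little code change. A minimal sketch, assuming the amazon-dax-client library for the Java SDK v1; the region and cluster endpoint are placeholders. Note that DAX is a write-through cache, so it mainly helps read latency rather than the PutRequest latency described in the question.

    import com.amazon.dax.client.dynamodbv2.AmazonDaxClientBuilder;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper;

    class DaxBackedMapper {
        static DynamoDBMapper build() {
            // The DAX client implements AmazonDynamoDB, so the mapper code stays unchanged.
            AmazonDynamoDB dax = AmazonDaxClientBuilder.standard()
                    .withRegion("us-east-1") // placeholder region
                    .withEndpointConfiguration("my-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111") // placeholder endpoint
                    .build();
            return new DynamoDBMapper(dax);
        }
    }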
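On the transactions point, a quick worked example: if each call wrapped its 10 summary items in a single transaction, the 7N+4 figure would mean roughly 7 × 10 + 4 = 74 write operations per call, compared to 10 writes for plain puts, which alone would explain a large slowdown.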
