How to model a forum using Amazon DynamoDB without hot partitions?

Question

The AWS DynamoDB documentation includes an example schema for a forum. However, the number of questions this schema is able to answer seems very small. In addition, the table seems to suffer from a hot-key problem (a burst of replies backs up on the same partition).

In a talk titled "Advanced Design Patterns for Amazon DynamoDB", the presenter, at around the 43-minute mark, breaks down a complex use case from Audible using only a single table with 3 GSIs (global secondary indexes).

I'm trying to learn proper DynamoDB modeling coming from a standard RDBMS 3NF background. How would a forum be designed to prevent hot-partitions while still meeting these common use-cases?

Queries:

  • Topics by Forum (sorted by date posted, or most recent reply)
  • Replies by Topic (sorted by date posted with pagination)
  • Replies by User (sorted by date posted)
  • Topics by User (sorted by date posted)
  • Topics with most votes

Basic Schema(?):

  • Forum: Partition key: Forum_GUID. Attributes: Name, Desc
  • User: Partition key: User_GUID. Attributes: email, join_date
  • Thread: Composite key: Forum_GUID, Topic_GUID. Attributes: posted_by, date, votes, body, subject
  • Reply: Composite key: Topic_GUID, Reply_GUID. Attributes: posted_by, date, votes, body

I'm assuming there are multiple solutions (including using a single table). I'm looking for any answer that can solve this while providing guidance on when, and how, to

Answer

You can use the schema above. Now, for your queries:

  1. Topics by Forum (sorted by date posted, or most recent reply)

     Query GSI1 where GSI1 PK = Forum123, sorted by GSI1 SK.

     Store either the most-recent-reply date or the date posted in GSI1 SK, depending on which sort order is requested more often.

  2. Replies by Topic (sorted by date posted with pagination)

     Query the table where PK = topic and SK begins with "reply", sorted by SK.

  3. Replies by User (sorted by date posted)

     Query GSI2 where PK = User123 and SK begins with "reply", sorted by SK.

  4. Topics by User (sorted by date posted)

     Query GSI2 where PK = User123 and SK begins with "topic", sorted by SK.

  5. Topics with most votes

Doing this across multiple forums would require another GSI, but that GSI would certainly suffer from a hot-key problem, since every item would share a single partition key. Instead, keep one item with a fixed key in the table that stores these counts, and have an asynchronous process update its values.
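
As a rough sketch of how the five access patterns above could map onto DynamoDB Query requests, the Python below only builds the request dictionaries, so it runs without AWS credentials. Everything not named in the answer is an assumption made for illustration: the table name `Forum`, the attribute names `PK`, `SK`, `GSI1PK`, `GSI1SK`, `GSI2PK`, `GSI2SK`, the `REPLY#` and `TOPIC#` sort-key prefixes, and the fixed key `TOP_TOPICS`.

```python
from typing import Any, Dict, List, Optional, Tuple

TABLE = "Forum"  # hypothetical table name


def build_query(pk_attr: str, pk: str, sk_attr: Optional[str] = None,
                sk_prefix: Optional[str] = None, index: Optional[str] = None,
                newest_first: bool = True, limit: int = 25,
                start_key: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build keyword arguments for a DynamoDB Query request."""
    names = {"#pk": pk_attr}
    values: Dict[str, Any] = {":pk": {"S": pk}}
    condition = "#pk = :pk"
    if sk_prefix is not None:
        # "sk starts with ..." is expressed with begins_with on the sort key.
        names["#sk"] = sk_attr
        values[":prefix"] = {"S": sk_prefix}
        condition += " AND begins_with(#sk, :prefix)"
    kwargs: Dict[str, Any] = {
        "TableName": TABLE,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
        "ScanIndexForward": not newest_first,  # False returns descending sort-key order
        "Limit": limit,
    }
    if index:
        kwargs["IndexName"] = index
    if start_key:
        # Pagination: pass the previous response's LastEvaluatedKey here.
        kwargs["ExclusiveStartKey"] = start_key
    return kwargs


def topics_by_forum(forum_id: str) -> Dict[str, Any]:
    # 1. GSI1 sort key holds date posted or most recent reply (assumed layout).
    return build_query("GSI1PK", forum_id, index="GSI1")


def replies_by_topic(topic_id: str, start_key: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    # 2. Base table; oldest reply first, one page at a time.
    return build_query("PK", topic_id, "SK", "REPLY#",
                       newest_first=False, start_key=start_key)


def replies_by_user(user_id: str) -> Dict[str, Any]:
    # 3. GSI2 partitioned by user, filtered to reply items.
    return build_query("GSI2PK", user_id, "GSI2SK", "REPLY#", index="GSI2")


def topics_by_user(user_id: str) -> Dict[str, Any]:
    # 4. Same index, filtered to topic items.
    return build_query("GSI2PK", user_id, "GSI2SK", "TOPIC#", index="GSI2")


def top_topics_item(vote_counts: List[Tuple[str, int]], n: int = 10) -> Dict[str, Any]:
    # 5. Item an async job could write under one fixed key (assumed "TOP_TOPICS").
    ranked = sorted(vote_counts, key=lambda tv: tv[1], reverse=True)[:n]
    return {
        "PK": {"S": "TOP_TOPICS"},
        "SK": {"S": "ALL_FORUMS"},
        "topics": {"L": [{"M": {"topic": {"S": t}, "votes": {"N": str(v)}}}
                         for t, v in ranked]},
    }
```

With boto3 installed, each dictionary could be passed as `boto3.client("dynamodb").query(**kwargs)`, and the item from `top_topics_item` to `put_item(TableName=..., Item=...)`. Readers then fetch the precomputed list with a single key lookup instead of querying a one-key GSI.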
