Should DynamoDB adjacency lists use discrete partition keys to model each type of relationship?


Question

I am building a forum and investigating modeling the data with DynamoDB and adjacency lists. Some top-level entities (like users) might have multiple types of relationships with other top-level entities (like comments).

For example, let's say we want to be able to do the following:


  • Users can like comments
  • Users can follow comments
  • Comments can display users that like it
  • Comments can display users that follow it
  • User profiles can show comments they like
  • User profiles can show comments they follow

So we essentially have a many-to-many relationship (user <=> comment) that itself comes in many types (like or follow).

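Note: this example is deliberately stripped down. In practice there will be more relationships to model, so I'm trying to think of something extensible here.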

Given the following top-level data:

First_id(Partition key)         Second_id(Sort Key)         Data
-------------                   ----------                  ------
User-Harry                      User-Harry                  User data
User-Ron                        User-Ron                    User data
User-Hermione                   User-Hermione               User data
Comment-A                       Comment-A                   Comment data
Comment-B                       Comment-B                   Comment data
Comment-C                       Comment-C                   Comment data

Furthermore, for each table below, there would be an equivalent Global Secondary Index with the partition and sort keys swapped.

This is what I would like to model in DynamoDB:


  1. Harry likes comment A
  2. Harry likes comment B
  3. Harry follows comment A
  4. Ron likes comment B
  5. Hermione follows comment C



Option 1

Use a third attribute to define the type of relationship:

First_id(Partition key)         Second_id(Sort Key)         Data
-------------                   ----------                  ------
Comment-A                       User-Harry                  "LIKES"
Comment-B                       User-Harry                  "LIKES"
Comment-A                       User-Harry                  "FOLLOWS"
Comment-B                       User-Ron                    "LIKES"
Comment-C                       User-Hermione               "FOLLOWS"

The downside to this approach is that query results contain redundant information, because they return extra items you may not care about. For example, if you want to query all the users that like a given comment, you also have to process all the users that follow that comment. Likewise, if you want to query all the comments that a user likes, you need to process all the comments that the user follows.

Option 2

Modify the keys to represent the relationship:

First_id(Partition key)         Second_id(Sort Key)
-------------                   ----------
LikeComment-A                   LikeUser-Harry
LikeComment-B                   LikeUser-Harry
FollowComment-A                 FollowUser-Harry
LikeComment-B                   LikeUser-Ron
FollowComment-C                 FollowUser-Hermione

This makes it efficient to query independently:


  1. Comment likes
  2. Comment follows
  3. User likes
  4. User follows

The downside is that the same top-level entity now has multiple keys, which might make things complex as more relationships are added.
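To make that downside concrete, here is a minimal sketch of how Option 2's keys could be built. The helper names `option_2_keys` and `partition_keys_for` are hypothetical and only illustrate the naming scheme from the table above; nothing here talks to DynamoDB.

```python
def option_2_keys(relation, comment, user):
    """Return (partition key, sort key) for one relationship under Option 2.

    The relationship name is prefixed onto both entity ids, e.g.
    ("LikeComment-A", "LikeUser-Harry").
    """
    return (f"{relation}{comment}", f"{relation}{user}")


def partition_keys_for(entity, relations):
    """List every partition key a single top-level entity ends up with."""
    return [f"{relation}{entity}" for relation in relations]


table_key = option_2_keys("Like", "Comment-A", "User-Harry")
gsi_key = tuple(reversed(table_key))  # the GSI swaps partition and sort keys

# One comment already needs two partition keys with only two relationship types.
comment_a_keys = partition_keys_for("Comment-A", ["Like", "Follow"])
print(table_key, gsi_key, comment_a_keys)
```

Each new relationship type adds another key per entity, which is the complexity the question is worried about.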

Option 3

Skip adjacency lists altogether and use separate tables, maybe one for Users, one for Likes, and one for Follows.

Option 4

Use a traditional relational database. While I'm not planning on going this route, because this is a personal project and I want to explore DynamoDB, if this is the right way to think about things, I'd love to hear why.

Thanks for reading this far! If there is anything I can do to simplify the question or clarify anything, please let me know :)

I've looked at the AWS best practices and this many-to-many SO post, and neither appears to address the many-to-many (with many) relationship, so any resources or guidance would be greatly appreciated.

Recommended answer

Your Option 1 is not possible because it does not have unique primary keys. In your sample data, you can see that you have two entries for (Comment-A, User-Harry).
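The collision can be illustrated with a small in-memory stand-in for the table. This is only a sketch: a Python dict keyed by (partition key, sort key) mimics the rule that a table holds at most one item per primary key, and `put_all` is a hypothetical helper, not a DynamoDB call.

```python
# Rows copied from the Option 1 table in the question.
option_1_rows = [
    ("Comment-A", "User-Harry", "LIKES"),
    ("Comment-B", "User-Harry", "LIKES"),
    ("Comment-A", "User-Harry", "FOLLOWS"),
    ("Comment-B", "User-Ron", "LIKES"),
    ("Comment-C", "User-Hermione", "FOLLOWS"),
]


def put_all(rows):
    """Store rows keyed by (First_id, Second_id); a later put replaces an earlier one."""
    table = {}
    for first_id, second_id, data in rows:
        table[(first_id, second_id)] = data
    return table


table = put_all(option_1_rows)
print(len(table))                          # 4 items survive out of 5 rows
print(table[("Comment-A", "User-Harry")])  # FOLLOWS (the LIKES row was replaced)
```

Harry's "LIKES" on Comment A is lost because the "FOLLOWS" row has the same primary key.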

Solution 1

The way to implement what you are looking for is by using slightly different attributes for your table and the GSI. If Harry likes Comment A, then your attributes should be:

hash_key: User-Harry
gsi_hash_key: Comment-A
sort_key_for_both: Likes-User-Harry-Comment-A

Now you have only one partition key value for your top level entities in both the table and the GSI, and you can query for a specific relationship type by using the begins_with operator.
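As a rough sketch of how this could look, the code below builds items with the three attributes named above and simulates the two queries in memory. The table name `Forum`, the index name `gsi`, and the helpers `make_item`, `query`, and `query_kwargs` are assumptions for illustration. `query_kwargs` only builds the arguments one might pass to the boto3 client's `query` call and does not contact AWS.

```python
def make_item(relation, user, comment):
    """Build one relationship item using the attribute names from the answer."""
    return {
        "hash_key": user,
        "gsi_hash_key": comment,
        "sort_key_for_both": f"{relation}-{user}-{comment}",
    }


ITEMS = [
    make_item("Likes", "User-Harry", "Comment-A"),
    make_item("Likes", "User-Harry", "Comment-B"),
    make_item("Follows", "User-Harry", "Comment-A"),
    make_item("Likes", "User-Ron", "Comment-B"),
    make_item("Follows", "User-Hermione", "Comment-C"),
]


def query(items, partition_attr, partition_value, sort_prefix):
    """Simulate a Query: exact match on the partition key, begins_with on the sort key."""
    return sorted(
        item["sort_key_for_both"]
        for item in items
        if item[partition_attr] == partition_value
        and item["sort_key_for_both"].startswith(sort_prefix)
    )


def query_kwargs(partition_attr, partition_value, sort_prefix, index_name=None):
    """Build arguments for a boto3 client query() call (nothing is sent here)."""
    kwargs = {
        "TableName": "Forum",  # hypothetical table name
        "KeyConditionExpression": "#pk = :pk AND begins_with(#sk, :prefix)",
        "ExpressionAttributeNames": {"#pk": partition_attr, "#sk": "sort_key_for_both"},
        "ExpressionAttributeValues": {
            ":pk": {"S": partition_value},
            ":prefix": {"S": sort_prefix},
        },
    }
    if index_name is not None:
        kwargs["IndexName"] = index_name  # query the GSI instead of the base table
    return kwargs


# Base table: which comments does Harry like?
harry_likes = query(ITEMS, "hash_key", "User-Harry", "Likes-")
# GSI: who follows Comment A?
comment_a_followers = query(ITEMS, "gsi_hash_key", "Comment-A", "Follows-")
gsi_request = query_kwargs("gsi_hash_key", "Comment-A", "Follows-", index_name="gsi")
```

Because the relationship type leads the sort key, a single prefix selects one kind of relationship without reading the others.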

Solution 2

You could make the relationship a top-level entity. For example, Likes-User-Harry-Comment-A would have two entries in the database because it is "adjacent to" both User-Harry and Comment A.

This allows you flexibility if you want to model more complex information about the relationships in the future (including the ability to describe the relationship between relationships, such as Likes-User-Ron-User-Harry Causes Follows-User-Ron-User-Harry).

However, this strategy requires more items to be stored in the database, and it means that saving a "like" (so that it can be queried) is not an atomic operation. (You can work around that by writing only the relationship entity, and then using DynamoDB Streams + Lambda to write the two entries mentioned at the beginning of this solution.)

Update: With DynamoDB Transactions, saving a "like" this way can actually be a fully ACID operation.
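A transactional write for Solution 2 might be sketched as follows. The key layout is an assumption, since the answer does not spell it out: the relationship entity is stored under its own id, and each adjacency entry uses the user or comment as `First_id` with the relationship as `Second_id`. The table name `Forum` and the helper `like_transaction` are hypothetical. The code only builds the request for the boto3 client's `transact_write_items` call and does not contact AWS.

```python
def like_transaction(user, comment, table_name="Forum"):
    """Build a transact_write_items request that saves a 'like' as three items."""
    relation = f"Likes-{user}-{comment}"

    def put(first_id, second_id):
        # One Put action in DynamoDB's low-level attribute-value format.
        return {
            "Put": {
                "TableName": table_name,
                "Item": {
                    "First_id": {"S": first_id},
                    "Second_id": {"S": second_id},
                },
            }
        }

    return {
        "TransactItems": [
            put(relation, relation),  # the relationship as a top-level entity
            put(user, relation),      # adjacency entry for the user (assumed layout)
            put(comment, relation),   # adjacency entry for the comment (assumed layout)
        ]
    }


request = like_transaction("User-Harry", "Comment-A")
# With AWS credentials and an existing table, this could be sent as:
#   boto3.client("dynamodb").transact_write_items(**request)
```

All three puts succeed or fail together, so no reader sees a "like" that exists on one side of the relationship but not the other.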
