Get trending posts in DynamoDB / AWS Ecosystem


Question


I'm trying to build my own social network / forum application, where people can add and like each other's posts. I'm using DynamoDB as my database, with a single table. For the post-liking functionality I'm using a Lambda function in combination with DynamoDB Streams, which aggregates the likes attribute.

I'm currently working on a ranking mechanism for these user posts. With it, I want to make sure my users can list the posts in a forum that are interesting at that point in time.
For that purpose, I read how reddit handles its ranking algorithm on this page.
I also read this question on Stackoverflow, which is close to mine but has no good answer imo.

My question is: how would one solve this problem with the help of the AWS ecosystem (maybe even with DynamoDB and Lambda functions alone)?


My database schema looks something like this:

Partitionkey                                     Sortkey             likes       ...
----------                                       --------            ------
forum#soccer                                     01.08.19 13:15
forum#baseball                                   22.08.19 20:11
post#soccer#Do you think FC Barcelona wins?      05.08.19 10:20       203
post#soccer#Which club is your favorite ?        05.08.19 10:20       2
like#Which club is your favorite ?               John Wick
like#Which club is your favorite ?               Walter White
...


Each insert of an item whose partition key starts with like# triggers a Lambda function, which updates the likes column of the corresponding post entry.
My aim is to query the trendiest posts at the current time. This should be possible with the available information, such as the creation time and like count of each post. Currently my query just returns the newest posts.

Recommended answer

I'll offer one possible solution that uses only DynamoDB and Lambda (and maybe AWS SQS). If it doesn't fit, we could consider other solutions, such as Amazon ElastiCache.

  1. Your DynamoDB table will have an item with a partition key (NOTE 1) named trending#posts, or just trending (it's up to you), and a sort key such as the date or the type of post (or anything else you want to sort by). You may want to analyze trends over time, using the date as the sort key, or filter trending posts by post type. If you don't need filters, you might use just a single value.

  2. Each like on a post will trigger a Lambda which will handle trending posts (NOTE 2).

  3. When triggered, the Lambda will receive the liked post and will perform the following:

     1. Read all N trending posts saved in your table.

     2. Read the number of likes and the post time of those posts.

     3. Compute the trending score for the current N posts and, if the liked post is not among them, for the liked post too.

     4. Sort the posts again and save the N with the greatest score in your table.
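
The steps above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: the answer does not prescribe a scoring formula, so the sketch borrows the Reddit-style "hot" score (likes on a log scale plus a bonus for newer posts), treats the table's `dd.mm.yy HH:MM` sort keys as UTC, and works on in-memory dicts with hypothetical field names (`pk`, `created`, `likes`) instead of reading from and writing to DynamoDB.

```python
import math
from datetime import datetime, timezone

# Assumption: a Reddit-style "hot" score; the answer leaves the formula open.
EPOCH_OFFSET = 1134028003  # reference timestamp used in Reddit's published formula
DECAY_SECONDS = 45000      # 12.5 hours of age offsets a 10x difference in likes


def parse_created(sort_key: str) -> datetime:
    """Parse a 'dd.mm.yy HH:MM' sort key as in the question's table (assumed UTC)."""
    return datetime.strptime(sort_key, "%d.%m.%y %H:%M").replace(tzinfo=timezone.utc)


def hot_score(likes: int, created: datetime) -> float:
    """Higher for more likes and for newer posts; independent of the current time."""
    order = math.log10(max(likes, 1))
    return order + (created.timestamp() - EPOCH_OFFSET) / DECAY_SECONDS


def update_trending(trending: list, liked: dict, n: int) -> list:
    """Merge the liked post into the stored list, re-score, and keep the top n."""
    by_key = {post["pk"]: post for post in trending}
    by_key[liked["pk"]] = liked  # refreshes the like count if already trending
    ranked = sorted(
        by_key.values(),
        key=lambda post: hot_score(post["likes"], parse_created(post["created"])),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical items shaped after the question's table.
trending = [
    {"pk": "post#soccer#Do you think FC Barcelona wins?",
     "created": "05.08.19 10:20", "likes": 203},
]
liked = {"pk": "post#soccer#Which club is your favorite ?",
         "created": "05.08.19 10:20", "likes": 3}
top = update_trending(trending, liked, n=10)
```

Because this particular score depends on the creation time rather than on the current time, the relative order of stored posts only changes when a like arrives, which is why re-scoring on each like is enough.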


NOTE 1: you don't need to keep the exact score over time, just the ranking. I mean, if you save the trending list at 9 A.M., you don't need correct scores at 1 P.M., just the positions (1st, 2nd, ...). You only need a new score when a new like occurs.

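NOTE 2: I said "and maybe AWS SQS" because users may like posts at the same time, so the Lambda would be executed concurrently and consistency problems may happen. With AWS SQS, each like pushes an event to a queue, and the queue triggers the Lambda. For the Lambdas not to run at the same time, the queue should be a FIFO queue with a single message group, or the function's reserved concurrency should be limited to 1; a standard queue can still invoke several Lambdas in parallel.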
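
As a rough sketch of the SQS variant, the handler below reads the liked posts from an SQS-triggered Lambda event one message at a time. The `Records` list and the string `body` field follow the event shape Lambda receives from SQS; the JSON message layout with a `post_key` field is a hypothetical choice, and the re-ranking step is left as a placeholder.

```python
import json


def extract_liked_posts(event: dict) -> list:
    """Return the liked post keys from an SQS-triggered Lambda event, in order."""
    liked = []
    for record in event.get("Records", []):
        body = json.loads(record["body"])  # SQS delivers the message body as a string
        liked.append(body["post_key"])     # hypothetical field name
    return liked


def handler(event, context):
    """Process each like sequentially within one invocation."""
    for post_key in extract_liked_posts(event):
        # Placeholder: re-score and save the trending list for this post here.
        print(f"re-ranking after like on {post_key}")
    return {"processed": len(event.get("Records", []))}
```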
