How to implement the activity stream in a social network


Question

I'm developing my own social network, and I haven't found examples on the web of how to implement a stream of users' actions. For example, how do I filter actions for each user? How do I store the action events? Which data model and object model can I use for the action stream and for the actions themselves?

Recommended answer

Summary: For about 1 million active users and 150 million stored activities, I keep it simple:


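  • Use a relational database to store unique activities (one record per activity, i.e. per thing that happened). Make the records as compact as you can, and structure them so that you can quickly grab a batch of activities either by activity ID or by a set of friend IDs with a time constraint.

  • Whenever an activity record is created, publish its ID to Redis, adding the ID to an activity stream list for every friend/subscriber who should see the activity.

Query Redis to get the activity stream for any user and then grab the related data from the db as needed. Fall back to querying the db by time if the user needs to browse far back in time (if you even offer this).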

I use a plain old MySQL table for dealing with about 15 million activities.

It looks like this:

id             
user_id       (int)
activity_type (tinyint)
source_id     (int)  
parent_id     (int)
parent_type   (tinyint)
time          (datetime but a smaller type like int would be better) 
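The layout above could be written out roughly as follows. This is only a sketch: it uses Python's built-in sqlite3 module so that it runs anywhere, whereas the answer uses MySQL, and the table name `activity`, the helper names, and the choice of an integer Unix timestamp for `time` are assumptions made for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; column names follow the layout above.
SCHEMA = """
CREATE TABLE activity (
    id            INTEGER PRIMARY KEY,  -- surrogate key
    user_id       INTEGER NOT NULL,     -- who performed the action
    activity_type INTEGER NOT NULL,     -- small numeric code for the kind of activity
    source_id     INTEGER NOT NULL,     -- ID of the record the activity refers to
    parent_id     INTEGER,              -- ID of the object the activity relates to
    parent_type   INTEGER,              -- small numeric code for the kind of object
    time          INTEGER NOT NULL      -- Unix timestamp (assumed "smaller type")
);
"""


def create_activity_table(conn: sqlite3.Connection) -> None:
    """Create the (hypothetically named) activity table."""
    conn.executescript(SCHEMA)


def record_activity(conn, user_id, activity_type, source_id,
                    parent_id, parent_type, time) -> int:
    """Insert one compact activity row and return its new ID."""
    cur = conn.execute(
        "INSERT INTO activity "
        "(user_id, activity_type, source_id, parent_id, parent_type, time) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (user_id, activity_type, source_id, parent_id, parent_type, time),
    )
    conn.commit()
    return cur.lastrowid
```

Keeping every column a small integer is what makes the rows compact; any display text is looked up from the referenced records when the stream is rendered.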


activity_type tells me the type of activity, source_id tells me the record that the activity is related to. So if the activity type means "added favorite" then I know that the source_id refers to the ID of a favorite record.

The parent_id/parent_type are useful for my app - they tell me what the activity is related to. If a book was favorited, then parent_id/parent_type would tell me that the activity relates to a book (type) with a given primary key (id).
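One way to picture how the two type codes are interpreted is a pair of enumerations. The sketch below is illustrative only: the numeric values, the `POSTED_COMMENT` and `AUTHOR` members, and the `describe` helper are all hypothetical, since the answer does not list its actual codes.

```python
from enum import IntEnum


class ActivityType(IntEnum):
    ADDED_FAVORITE = 1   # source_id then points at a favorite record
    POSTED_COMMENT = 2   # hypothetical second activity type


class ParentType(IntEnum):
    BOOK = 1             # parent_id then points at a book's primary key
    AUTHOR = 2           # hypothetical second parent type


def describe(row: dict) -> str:
    """Turn the compact numeric columns into a readable description."""
    activity = ActivityType(row["activity_type"])
    parent = ParentType(row["parent_type"])
    return (
        f"user {row['user_id']} {activity.name.lower()} "
        f"(record {row['source_id']}) on {parent.name.lower()} {row['parent_id']}"
    )
```

For a row where a book with primary key 99 was favorited, `describe` would report the favorite record via source_id and the book via parent_id/parent_type.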


I index on (user_id, time) and query for activities that are user_id IN (...friends...) AND time > some-cutoff-point. Ditching the id and choosing a different clustered index might be a good idea - I haven't experimented with that.
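The index and query described here might look like the sketch below. It again uses sqlite3 in place of MySQL, trims the table to the columns the query needs, and the index name, the `friends_activity` function, and the `limit` parameter are assumptions added for the example.

```python
import sqlite3


def friends_activity(conn, friend_ids, cutoff, limit=50):
    """Return recent activity rows for a set of friends, newest first."""
    if not friend_ids:
        return []
    # One "?" placeholder per friend ID keeps the query parameterized.
    placeholders = ", ".join("?" for _ in friend_ids)
    sql = (
        "SELECT id, user_id, activity_type, source_id, time FROM activity "
        f"WHERE user_id IN ({placeholders}) AND time > ? "
        "ORDER BY time DESC LIMIT ?"
    )
    return conn.execute(sql, (*friend_ids, cutoff, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE activity (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "activity_type INTEGER, source_id INTEGER, time INTEGER)"
)
# The composite index that serves the user_id IN (...) AND time > ? filter.
conn.execute("CREATE INDEX idx_activity_user_time ON activity (user_id, time)")
conn.executemany(
    "INSERT INTO activity (user_id, activity_type, source_id, time) "
    "VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 100), (2, 1, 11, 200), (3, 1, 12, 300), (2, 2, 13, 400)],
)
rows = friends_activity(conn, [2, 3], cutoff=150)
```

With the sample rows, `rows` holds the three activities by users 2 and 3 that are newer than the cutoff, ordered from newest to oldest.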


Pretty basic stuff, but it works, it's simple, and it is easy to work with as your needs change. Also, if you aren't using MySQL you might be able to do better index-wise.


For faster access to the most recent activities, I've been experimenting with Redis. Redis stores all of its data in-memory, so you can't put all of your activities in there, but you could store enough for most of the commonly-hit screens on your site. The most recent 100 for each user or something like that. With Redis in the mix, it might work like this:


  • Create your MySQL activity record.

  • For each friend of the user who created the activity, push the ID onto their activity list in Redis.

  • Trim each list to the last X items.

Redis is fast and offers a way to pipeline commands across one connection - so pushing an activity out to 1000 friends takes milliseconds.
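The push-and-trim steps can be sketched as below. To keep the example runnable without a server, it includes a tiny in-memory stand-in (`FakeRedis`, hypothetical) that mimics the LPUSH, LTRIM and LRANGE list commands; the redis-py client exposes `pipeline()`, `lpush`, `ltrim`, `lrange` and `execute()` under the same names, so `fan_out` should work against it unchanged. The `stream:<user_id>` key format and the function names are assumptions.

```python
from collections import defaultdict


class FakePipeline:
    """Queues commands and applies them in one batch, like a Redis pipeline."""

    def __init__(self, store):
        self.store, self.commands = store, []

    def lpush(self, key, value):
        self.commands.append(("lpush", key, value))

    def ltrim(self, key, start, stop):
        self.commands.append(("ltrim", key, start, stop))

    def execute(self):
        for name, key, *args in self.commands:
            items = self.store.lists[key]
            if name == "lpush":
                items.insert(0, args[0])          # newest ID goes to the head
            else:
                start, stop = args                # stop is inclusive, as in Redis
                self.store.lists[key] = items[start:stop + 1]
        self.commands = []


class FakeRedis:
    """In-memory stand-in for the few list commands used here."""

    def __init__(self):
        self.lists = defaultdict(list)

    def pipeline(self):
        return FakePipeline(self)

    def lrange(self, key, start, stop):
        items = self.lists[key]
        end = len(items) if stop == -1 else stop + 1
        return items[start:end]


def fan_out(client, activity_id, friend_ids, max_items=100):
    """Push one activity ID onto every friend's list, then trim each list."""
    pipe = client.pipeline()
    for friend_id in friend_ids:
        key = f"stream:{friend_id}"               # assumed key naming
        pipe.lpush(key, activity_id)
        pipe.ltrim(key, 0, max_items - 1)         # keep only the newest max_items
    pipe.execute()                                # one round trip for all commands
```

Batching every LPUSH/LTRIM pair into a single pipeline is what keeps the fan-out to many friends down to one round trip.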


For a more detailed explanation of what I am talking about, see Redis' Twitter example: http://redis.io/topics/twitter-clone

Update, February 2011: I've got 50 million active activities at the moment and I haven't changed anything. One nice thing about doing something similar to this is that it uses compact, small rows. I am planning on making some changes that would involve many more activities and more queries of those activities and I will definitely be using Redis to keep things speedy. I'm using Redis in other areas and it really works well for certain kinds of problems.

Update, July 2014: We're up to about 700K monthly active users. For the last couple of years, I've been using Redis (as described in the bulleted list) for storing the last 1000 activity IDs for each user. There are usually about 100 million activity records in the system and they are still stored in MySQL and are still the same layout. These records let us get away with less Redis memory, they serve as the record of activity data, and we use them if users need to page further back in time to find something.
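The read side of this arrangement could be sketched as follows: the first page is served from the cached ID list, and older pages fall back to a time-based query against the table. This is an assumption-laden illustration, not the author's code: `cached_ids` stands for the list an LRANGE call would return, sqlite3 stands in for MySQL, and the `before_time` cursor and all function names are hypothetical.

```python
import sqlite3


def fetch_by_ids(conn, ids):
    """Load rows for cached activity IDs, preserving the cached order."""
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = conn.execute(
        f"SELECT id, user_id, time FROM activity WHERE id IN ({placeholders})",
        list(ids),
    ).fetchall()
    by_id = {row[0]: row for row in rows}
    return [by_id[i] for i in ids if i in by_id]


def fetch_older(conn, friend_ids, before_time, limit):
    """Fallback: page further back in time straight from the database."""
    if not friend_ids:
        return []
    placeholders = ", ".join("?" for _ in friend_ids)
    return conn.execute(
        "SELECT id, user_id, time FROM activity "
        f"WHERE user_id IN ({placeholders}) AND time < ? "
        "ORDER BY time DESC LIMIT ?",
        (*friend_ids, before_time, limit),
    ).fetchall()


def read_stream(conn, cached_ids, friend_ids, limit=20, before_time=None):
    """Serve the first page from the cache; older pages come from the table."""
    if before_time is None:
        return fetch_by_ids(conn, cached_ids[:limit])
    return fetch_older(conn, friend_ids, before_time, limit)
```

A caller would request the first page without `before_time`, then pass the `time` of the last row it received to page further back.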


This wasn't a clever or especially interesting solution but it has served me well.
