MySQL Optimization for Social Friends Graph (Group By Friends)

Question

I'm having an issue with a query and with scaling its performance for users with a large number of friends. The goal of the query is to grab the top "activities" performed by your friends in the last 30 days. Here is my query:

SELECT a.activity_id, b.activity_name, count(a.activity_id) as total_count
FROM friends as f
INNER JOIN activities as a on (a.user_id = f.friend_id
and a.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY))
INNER JOIN activity as b on a.activity_id = b.activity_id
WHERE f.user_id = 1 and f.is_approved = 1
GROUP by a.activity_id
ORDER by total_count DESC
LIMIT 5

This query takes about 25 seconds to run for every user, no matter how big or small their friends graph is. The indexes are below:

Table: activities
PRIMARY: [act_id] Other: [activity_id, user_id], [user_id, created_at], [created_at]

Table: friends
PRIMARY: [user_id, friend_id] Other: [user_id, is_approved], [friend_id]

Table: activity
PRIMARY: [activity_id]

Any help would be greatly appreciated.

Update: here is the EXPLAIN output:

id  select_type  table  type    key            key_len  ref           rows  Extra
1   SIMPLE       F      ref     friend_lookup  5        const,const   795   Using temporary; Using filesort
1   SIMPLE       A      ref     user_id        4        F.friend_id   58    Using where
1   SIMPLE       B      eq_ref  PRIMARY        4        P.activty_id  1     Using where

Recommended answer

Robin is correct about the date field. If you use a function there, it may have to be computed for every entry being scanned. The version below uses a MySQL user variable: I calculate the cutoff ONCE into @StartDate and use THAT value in the join clause.

The only other thing I changed was adding the STRAIGHT_JOIN keyword. In many instances I've found it has helped me and others optimize a query. It prevents MySQL from interpreting the query another way, for example by looking at the activity table first because it is a smaller file and then back-linking from it. STRAIGHT_JOIN tells the optimizer to join the tables in the order you've listed.

SELECT STRAIGHT_JOIN
      a.activity_id,
      b.activity_name,
      count(a.activity_id) as total_count
   FROM
      -- compute the 30-day cutoff once and keep it in @StartDate
      ( select @StartDate := date_sub( now(), interval 30 day ) ) sqlvars,
      friends as f
         INNER JOIN activities as a
            on a.user_id = f.friend_id
           and a.created_at >= @StartDate
         INNER JOIN activity as b
            on a.activity_id = b.activity_id
   WHERE
          f.user_id = 1
      and f.is_approved = 1
   GROUP by
      a.activity_id
   ORDER by
      total_count DESC
   LIMIT 5

Per feedback

That being the case, and given this rolling 30-day window, I would resort to a nightly table creation that holds nothing but user ID, activity and count, and query from that instead:

create table DailyRollupActivity
select a.user_id,
       a.activity_id,
       count(*) total_count
   from
      -- compute the 30-day cutoff once and keep it in @StartDate
      ( select @StartDate := date_sub( now(), interval 30 day ) ) sqlvars,
      activities a
   where
      a.created_at >= @StartDate
   group by
      a.user_id,
      a.activity_id

Ensure this daily aggregate table has an index on (user ID, total count), then query it directly by friend ID, ordered by total_count descending with a limit of 5. A nightly trigger / event / script that builds the table ONCE is a small price to pay. How critical is it to see activity for the current date too? Is the activity so drastic that one day of it would skew what you otherwise want to present to the user?
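
The answer describes the index and the rollup query only in words, so here is a rough sketch of what they might look like. It is one reading of the suggestion, not the answerer's exact query. The index name idx_rollup_user_count and the alias friends_total are made up for illustration. Summing each friend's pre-aggregated count per activity is an assumption about how the per-user rows should be combined.

```sql
-- Index suggested in the answer: (user_id, total_count).
-- The index name is hypothetical.
CREATE INDEX idx_rollup_user_count
    ON DailyRollupActivity (user_id, total_count);

-- Top 5 activities among the approved friends of user 1, read from
-- the nightly rollup instead of the raw activities table.
-- Assumption: per-friend counts are summed per activity.
SELECT r.activity_id,
       b.activity_name,
       SUM(r.total_count) AS friends_total
  FROM friends AS f
       INNER JOIN DailyRollupActivity AS r
          ON r.user_id = f.friend_id
       INNER JOIN activity AS b
          ON b.activity_id = r.activity_id
 WHERE f.user_id = 1
   AND f.is_approved = 1
 GROUP BY r.activity_id, b.activity_name
 ORDER BY friends_total DESC
 LIMIT 5;
```

Because the rollup holds at most one row per user and activity, the join reads far fewer rows per friend than the raw activities table would. The trade-off is that results are only as fresh as the last nightly rebuild.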
