优化mysql查询(喜欢/不喜欢) [英] Optimizing mysql query (likes/dislikes)

查看:138
本文介绍了优化mysql查询(喜欢/不喜欢)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的网站包含用户可以投票的内容(类似/不喜欢类似于reddit upvotes)。当选择单个内容时,我运行以下子查询来获取喜欢数,不喜欢数和当前用户的投票。



投票存储在单独的表{contentId,userId,vote}

  SELECT 

[... BUNCH OF FIELDS ...]

(SELECT COUNT(*)FROM votes vt WHERE vt.cId = c.contentId AND vote ='.Constants :: LIKE。')AS like,
(SELECT COUNT(*)FROM votes vt WHERE vt.cId = c.contentId AND vote ='.Constants :: DISLIKE。')AS dislikes,
COALESCE((SELECT vote FROM votes vt WHERE vt.cId = c.contentId AND userId ='.USER_ID。'),'.Constants :: NO_VOTE。')AS myVote

FROM content

[... OTHER STUFF ...]有没有更好的方法来实现这一点(组合这些子查询或其他)?

$ b

$ b

解决方案

在性能方面,这些相关的子查询可以吃你的午餐。并且吞噬你的午餐盒,对于大集合,因为MySQL处理他们的方式。对外部查询中返回的每一行执行这些子查询中的每一个。对于大集合,这可能会非常昂贵。



另一种方法是使用内联视图来实现所有内容的喜欢和不喜欢,然后执行联接操作



但是,这种方法也很昂贵,特别是当你只需要对几个内容行进行投票计数 。通常,外部查询中有一个谓词,也可以合并到内联视图中,以限制需要检查和返回的行数。



我们想要使用OUTER连接到该内联视图,因此它返回等同于您的查询的结果;在投票表中没有匹配的行时,从 content 返回一行。

  SELECT [... BUNCH OF FIELDS ...] 
,COALESCE(v.likes,0)AS like
,COALESCE v.dislikes,0)AS不喜欢
,COALESCE(v.myvote,'。Constants :: NO_VOTE。')AS myvote
从内容c
LEFT
JOIN .cId
,SUM(vt.vote ='.Constants :: LIKE。')AS like
,SUM(vt.vote ='.Constants :: DISLIKE。')AS dislikes
,MAX(IF(vt.userId ='.USER_ID。',vt.vote,NULL)AS myvote
FROM votes vt
GROUP
BY vt.cId
)v
ON v.cId = c.contentId

[... OTHER STUFF ...]

请注意,内联视图查询(别名为 v )将查看 votes 表。如果只需要一个子集,那么可以考虑添加一个适当的谓词(在WHERE子句中或作为JOIN到另一个表)。在您的查询中没有来自 [... OTHER STUFF ...] 的指示,它是否只返回内容中的几行或如果您需要所有行,因为您正在按喜欢等顺序。



content 表中选择的少量行使用相关的子查询(如在您的查询中)实际上可能比实现巨大的内联视图和执行连接操作。



哦...对于这两个查询,不用说, votes 具有 cId 的引导列的表将有益于性能。对于内联视图,您不希望MySQL的开销必须对所有这些行执行 filesort 操作来执行GROUP BY。对于相关的子查询,您希望他们使用索引范围扫描,而不是全扫描。


My website contains pieces of content on which users can vote (like/dislike similar to reddit upvotes). When selecting an individual piece of content, I run the following subqueries to get the number of likes, the number of dislikes and the current user's vote.

The votes are stored in a separate table {contentId, userId, vote}

SELECT

[... BUNCH OF FIELDS ...]

(SELECT COUNT(*) FROM votes vt WHERE vt.cId = c.contentId AND vote = '.Constants::LIKE.') AS likes,
(SELECT COUNT(*) FROM votes vt WHERE vt.cId = c.contentId AND vote = '.Constants::DISLIKE.') AS dislikes,
COALESCE((SELECT vote FROM votes vt WHERE vt.cId = c.contentId AND userId = '.USER_ID.'), '.Constants::NO_VOTE.') AS myVote

FROM content

[... OTHER STUFF ... ]

Is there a better way to achieve this (combine those subqueries or otherwise)?

解决方案

In terms of performance, those correlated subqueries can eat your lunch. And devour your lunchbox too, for large sets, because of the way MySQL processes them. Each of those subqueries gets executed for every row returned in the outer query. And that can get very expensive for large sets.

An alternative approach is to use an inline view to materialize the likes and dislikes for all content, and then do a join operation to that.

But, this approach can be expensive too, particularly when you are only needing the vote "counts" for just a few content rows, out of a bazillion rows. Often, there is a predicate from the outer query that can also be incorporated into the inline view, to limit the number of rows that need to be examined and returned.

We want to use an OUTER join to that inline view, so it returns a result equivalent to your query; returning a row from content when there are no matching rows in the vote table.

SELECT [... BUNCH OF FIELDS ...]
     , COALESCE(v.likes,0) AS likes
     , COALESCE(v.dislikes,0) AS dislikes
     , COALESCE(v.myvote,'.Constants::NO_VOTE.') AS myvote
  FROM content c
  LEFT
  JOIN ( SELECT vt.cId
              , SUM(vt.vote = '.Constants::LIKE.') AS likes
              , SUM(vt.vote = '.Constants::DISLIKE.') AS dislikes
              , MAX(IF(vt.userId = '.USER_ID.',vt.vote,NULL) AS myvote
           FROM votes vt
          GROUP
             BY vt.cId
       ) v
    ON v.cId = c.contentId

       [... OTHER STUFF ... ]

Note that the inline view query (aliased as v) is going to look at EVERY single row from the votes table. If you only need a subset, then consider adding an appropriate predicate (either in a WHERE clause or as a JOIN to another table). There's no indication from the [... OTHER STUFF ...] in your query whether it's returning just a few rows from content or if you are needing all of the rows because you are ordering by likes, etc.

For a small number of rows selected from the content table, using the correlated subqueries (like in your query) can actually be faster than materializing a huge inline view and performing a join operation against it.

Oh... and for both queries, it goes without saying that an appropriate index on the votes table with a leading column of cId will benefit performance. For the inline view, you don't want the overhead of MySQL having to perform a filesort operation on all of those rows to do the GROUP BY. And for the correlated subqueries, you want them to use a index range scan, not a full scan.

这篇关于优化mysql查询(喜欢/不喜欢)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆