Optimizing a simple query on two large tables

Views: 135 Posted: 2020/5/21 21:16:55 Tags: mysql performance optimization query-optimization

Problem description

I'm trying to offer a feature that shows the pages most viewed by a user's friends. My friendships table has 5.7M rows and the views table has 5.3M rows. At the moment I just want to run a query on these two tables and find the 20 page IDs most viewed by a person's friends.

Here's the query as I have it now:

SELECT page_id 
FROM `views` INNER JOIN `friendships` ON friendships.receiver_id = views.user_id 
WHERE (`friendships`.`creator_id` = 143416) 
GROUP BY page_id 
ORDER BY count(views.user_id) desc 
LIMIT 20

Here's what the EXPLAIN output looks like:

+----+-------------+-------------+------+-----------------------------------------+---------------------------------+---------+-----------------------------------------+------+----------------------------------------------+
| id | select_type | table       | type | possible_keys                           | key                             | key_len | ref                                     | rows | Extra                                        |
+----+-------------+-------------+------+-----------------------------------------+---------------------------------+---------+-----------------------------------------+------+----------------------------------------------+
|  1 | SIMPLE      | friendships | ref  | PRIMARY,index_friendships_on_creator_id | index_friendships_on_creator_id | 4       | const                                   |  271 | Using index; Using temporary; Using filesort | 
|  1 | SIMPLE      | views       | ref  | PRIMARY                                 | PRIMARY                         | 4       | friendships.receiver_id                 |   11 | Using index                                  | 
+----+-------------+-------------+------+-----------------------------------------+---------------------------------+---------+-----------------------------------------+------+----------------------------------------------+

The views table has a primary key of (user_id, page_id), and you can see this is being used. The friendships table has a primary key of (receiver_id, creator_id), and a secondary index on (creator_id).

If I run this query without the GROUP BY and LIMIT, it returns about 25,000 rows for this particular user, which is typical.

On the most recent real run, this query took 7 seconds to execute, which is way too long for a decent response in a web app.

One thing I'm wondering is whether I should change the secondary index to (creator_id, receiver_id). I'm not sure that will give much of a performance gain, though. I'll likely try it today, depending on the answers to this question.

Can you see any way to rewrite the query to make it lightning fast?

Update: I need to do more testing on it, but it appears my nasty query works out better if I don't do the grouping and sorting in the db, but do it in Ruby afterwards. The overall time is much shorter, by about 80% it seems. Perhaps my early testing was flawed, but this definitely warrants more investigation. If it's true, then wtf is MySQL doing?
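
For illustration, the Ruby-side grouping and sorting mentioned in the update might look roughly like the sketch below. It assumes the query is run without GROUP BY, ORDER BY and LIMIT and that the resulting page_id values have been collected into an array. The `top_pages` helper, the `rows` variable and the sample data are hypothetical and not taken from the original code. Ties are broken by page_id here, whereas the SQL version leaves the order of ties unspecified.

```ruby
# Hypothetical helper: count how often each page_id occurs and return the
# `limit` most frequent ids, most viewed first.
def top_pages(page_ids, limit = 20)
  counts = Hash.new(0)
  page_ids.each { |id| counts[id] += 1 }
  # Sort by descending count; break ties by page_id so the result is deterministic.
  counts.sort_by { |id, count| [-count, id] }.first(limit).map { |id, _count| id }
end

# Stand-in for the page_ids returned by the ungrouped join (made-up sample data).
rows = [7, 3, 7, 9, 3, 7]
top_two = top_pages(rows, 2)
```

With roughly 25,000 rows per user, this is a single pass to count plus a sort over the distinct page IDs, so the Ruby side should be cheap compared with the query itself.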
