Why is a MySQL JOIN significantly faster than WHERE IN (subquery)?

Question

I am trying to better understand why this query optimization makes such a large difference (over 100 times faster), so that I can reuse similar logic for other queries.

I am using MySQL 4.1. RESET QUERY CACHE and FLUSH TABLES were run before every query, and the timings can be reproduced consistently. The only thing that is obvious to me in the EXPLAIN output is that only 5 rows have to be found during the JOIN. But is that the whole answer to the speed difference? Both queries use a partial index (forum_stickies) to determine the status of deleted topics (topic_status=0).

Screenshots for deeper analysis with EXPLAIN

Slow query: 0.7+ seconds (cache cleared)

SELECT SQL_NO_CACHE forum_id, topic_id
FROM bb_topics
WHERE topic_last_post_id IN
  (SELECT SQL_NO_CACHE MAX(topic_last_post_id) AS topic_last_post_id
   FROM bb_topics
   WHERE topic_status=0
   GROUP BY forum_id)

Fast query: 0.004 seconds or less (cache cleared)

SELECT SQL_NO_CACHE forum_id, topic_id
FROM bb_topics AS s1
JOIN
  (SELECT SQL_NO_CACHE MAX(topic_last_post_id) AS topic_last_post_id
   FROM bb_topics
   WHERE topic_status=0
   GROUP BY forum_id) AS s2
ON s1.topic_last_post_id=s2.topic_last_post_id

Note that there is no index on the most important column (topic_last_post_id), but that cannot be helped (the results are stored for repeated use anyway).

Is the answer simply that the first query has to scan topic_last_post_id twice, the second time to match the results against the subquery? If so, why is it so dramatically slower rather than merely twice as slow?

(Less important: I am also curious why the first query still takes so long even if I do put an index on topic_last_post_id.)

Update: after much more searching, I found a thread on Stack Overflow that goes into this topic: Subqueries vs joins.

Recommended answer

Maybe the engine executes the subquery once for every row in bb_topics, just to check whether that row's topic_last_post_id appears in the result. That would be stupid, but it would explain the huge difference.
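The answer's guess matches how MySQL versions of that era are generally understood to handle IN (subquery): the optimizer rewrites it into a correlated EXISTS check that is re-evaluated per outer row (EXPLAIN typically labels this DEPENDENT SUBQUERY), whereas a subquery in the FROM clause is materialized once as a derived table. The Python sketch below is only a cost model of those two strategies, not MySQL's actual executor. The table contents, the helper names (make_topics, latest_post_per_forum, where_in_per_row, join_materialized), and the row counts are all made up for illustration, and the derived table is modeled as a hash set for simplicity.

```python
from collections import defaultdict


def make_topics(n_topics=200, n_forums=5):
    """Hypothetical stand-in for bb_topics.

    Each row is (topic_id, forum_id, topic_status, topic_last_post_id).
    """
    rows = []
    for topic_id in range(1, n_topics + 1):
        forum_id = topic_id % n_forums + 1
        status = 0 if topic_id % 7 else 1  # every 7th topic is not status 0
        rows.append((topic_id, forum_id, status, 1000 + topic_id))
    return rows


def latest_post_per_forum(topics, counter):
    """Model of the grouped subquery:
    SELECT MAX(topic_last_post_id) ... WHERE topic_status=0 GROUP BY forum_id
    """
    counter["subquery_runs"] += 1
    best = {}
    for _topic_id, forum_id, status, last_post in topics:
        counter["rows_scanned"] += 1
        if status == 0 and last_post > best.get(forum_id, -1):
            best[forum_id] = last_post
    return set(best.values())


def where_in_per_row(topics):
    """Assumed slow strategy: re-run the subquery for every outer row."""
    counter = defaultdict(int)
    result = []
    for topic_id, forum_id, _status, last_post in topics:
        counter["rows_scanned"] += 1
        if last_post in latest_post_per_forum(topics, counter):
            result.append((forum_id, topic_id))
    return result, dict(counter)


def join_materialized(topics):
    """Assumed fast strategy: evaluate the derived table once, then match."""
    counter = defaultdict(int)
    derived = latest_post_per_forum(topics, counter)
    result = []
    for topic_id, forum_id, _status, last_post in topics:
        counter["rows_scanned"] += 1
        if last_post in derived:
            result.append((forum_id, topic_id))
    return result, dict(counter)


if __name__ == "__main__":
    topics = make_topics()
    slow_rows, slow_cost = where_in_per_row(topics)
    fast_rows, fast_cost = join_materialized(topics)
    print("same result:", slow_rows == fast_rows)
    print("WHERE IN model:", slow_cost)
    print("JOIN model:   ", fast_cost)
```

In this model both strategies return the same rows, but the per-row version scans on the order of N × N rows for N topics while the materialized version scans about 2 × N. That quadratic growth (not exponential) is why the gap is far larger than "scanning the column twice" would suggest, and why an index on topic_last_post_id would not rescue the first query: the cost lies in repeating the grouped subquery, not in the final lookup.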
