Why is MAX() 100 times slower than ORDER BY ... LIMIT 1?

Question

I have a table foo with (among 20 others) columns bar, baz and quux, with indexes on baz and quux. The table has ~500k rows.

Why do the following two queries differ so much in speed? Query A takes 0.3s, while query B takes 28s.

Query A

select baz from foo
    where bar = :bar
    and quux = (select quux from foo where bar = :bar order by quux desc limit 1)

EXPLAIN output

id  select_type table   type    possible_keys   key     key_len ref     rows    Extra
1   PRIMARY     foo     ref     quuxIdx         quuxIdx 9       const   2       "Using where"
2   SUBQUERY    foo     index   NULL            quuxIdx 9       NULL    1       "Using where"

Query B

select baz from foo
    where bar = :bar
    and quux = (select MAX(quux) from foo where bar = :bar)

EXPLAIN output

id  select_type table   type    possible_keys   key     key_len ref     rows    Extra
1   PRIMARY     foo     ref     quuxIdx         quuxIdx 9       const   2       "Using where"
2   SUBQUERY    foo     ALL     NULL            NULL    NULL    NULL    448060  "Using where"

I use MySQL 5.1.34.

Recommended answer

You should add an index on (bar, quux).

Without this index, MySQL can't see how to perform the query efficiently, so it has to choose from various inefficient query plans.
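As a minimal sketch of that suggestion: the statements below create the composite index and re-check the plan for the subquery that was doing the full table scan. The index name foo_bar_quux_idx is made up for illustration, and :bar is the placeholder from the question, which has to be replaced with a literal value when running this in a plain MySQL client.

```sql
-- Hypothetical index name; the column order (bar, quux) follows the answer.
CREATE INDEX foo_bar_quux_idx ON foo (bar, quux);

-- Re-check the plan for the subquery from query B.
-- :bar is the question's placeholder; substitute a real value.
EXPLAIN
SELECT MAX(quux) FROM foo WHERE bar = :bar;
```

With bar as the leading column, the rows for a given bar sit together in the index ordered by quux, so the largest quux for that bar can be read from the index instead of being found by scanning the table. Whether the optimizer actually picks the new index is something to confirm in the EXPLAIN output rather than assume.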

In the first example it scans the quux index and, for each row found, looks up the corresponding value of bar in the original table. This takes twice as long to check each row, but it gets lucky: a row that has the correct value of bar is near the beginning of its scan, so it can stop early. This could be because the value of bar you are searching for occurs frequently, so the chance of being lucky is very high. As a result it may only have to examine a handful of rows before it finds a match, so even though it takes twice as long to check each row, the fact that only a few rows are checked gives a massive overall saving. Since you have no index on bar, MySQL doesn't know in advance that the value :bar occurs frequently, so it cannot know that this query will be fast.
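One rough way to check whether the searched value really is frequent is to compare the number of matching rows with the total row count. This is only a sketch using the table and column names from the question, and :bar is again the question's placeholder. A high share of matching rows makes an early match in the index scan more likely, but it does not guarantee one, since that also depends on where those rows fall in quux order.

```sql
-- Total rows in the table (about 500k according to the question).
SELECT COUNT(*) AS total_rows FROM foo;

-- Rows matching the searched value; substitute a real value for :bar.
SELECT COUNT(*) AS matching_rows FROM foo WHERE bar = :bar;
```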

In the second example it uses a different plan in which it always scans the whole table. Each row is read directly from the table, without using an index. This means that each individual row read is fast, but because you have a lot of rows, it is slow overall. If none of the rows matched :bar, this would be the faster query plan. But if roughly 1% of rows have the desired value of bar, this plan will be (very) approximately 100 times slower than the plan above. Since you have no index on bar, MySQL doesn't know this in advance.

You could also just add the missing index, and then both queries will run much faster.
