为什么在查询中使用LIMIT时MySQL变慢? [英] Why is MySQL slow when using LIMIT in my query?

查看:838
本文介绍了为什么在查询中使用LIMIT时MySQL变慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图弄清楚为什么我的查询之一变慢了,如何解决它,但是我对结果有些困惑.

I'm trying to figure out why is one of my query slow and how I can fix it but I'm a bit puzzled on my results.

我有一个约80列和775179行的orders表,并且正在执行以下请求:

I have an orders table with around 80 columns and 775179 rows and I'm doing the following request :

SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC LIMIT 200

在4.5秒内返回38行

which returns 38 rows in 4.5s

删除ORDER BY时,我得到了很好的改进:

When removing the ORDER BY I'm getting a nice improvement :

SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL LIMIT 200

38行

但是在不触摸ORDER BY的情况下删除LIMIT时,我得到的效果更好:

But when removing the LIMIT without touching the ORDER BY I'm getting an even better result :

SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC

0.10s(??)中的38行

38 rows in 0.10s (??)

为什么我的极限太饿了?

Why is my LIMIT so hungry ?

更进一步

在发送答案之前,我注意到我在creation_date(它是一个datetime)上有一个索引,然后尝试了一些事情,因此我删除了它,并且第一个查询现在以0.10s的速度运行.为什么会这样?

I was trying a few things before sending my answer and after noticing that I had an index on creation_date (which is a datetime) I removed it and the first query now runs in 0.10s. Why is that ?

编辑

好的猜测,我在where的其他列上有索引.

Good guess, I have indexes on the others columns part of the where.

mysql> explain SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC LIMIT 200;
+----+-------------+--------+-------+------------------------+---------------+---------+------+------+-------------+
| id | select_type | table  | type  | possible_keys          | key        | key_len | ref  | rows | Extra       |
+----+-------------+--------+-------+------------------------+---------------+---------+------+------+-------------+
|  1 | SIMPLE      | orders | index | id_state_idx,id_mp_idx | creation_date | 5       | NULL | 1719 | Using where |
+----+-------------+--------+-------+------------------------+---------------+---------+------+------+-------------+

组1行(0.00秒)

mysql> explain SELECT * FROM orders WHERE id_state = 2 AND id_mp IS NOT NULL ORDER BY creation_date DESC;
+----+-------------+--------+-------+------------------------+-----------+---------+------+-------+----------------------------------------------------+
| id | select_type | table  | type  | possible_keys          | key       | key_len | ref  | rows  | Extra                                              |
+----+-------------+--------+-------+------------------------+-----------+---------+------+-------+----------------------------------------------------+
|  1 | SIMPLE      | orders | range | id_state_idx,id_mp_idx | id_mp_idx | 3       | NULL | 87502 | Using index condition; Using where; Using filesort |
+----+-------------+--------+-------+------------------------+-----------+---------+------+-------+----------------------------------------------------+

推荐答案

索引不一定会提高性能.为了更好地了解正在发生的情况,如果您将explain包括在不同的查询中,将会有所帮助.

Indexes do not necessarily improve performance. To better understand what is happening, it would help if you included the explain for the different queries.

我最好的猜测是,您在id_state甚至id_state, id_mp中都有一个索引,可以用来满足where子句.如果是这样,则不带order by的第一个查询将使用此索引.它应该很快.即使没有索引,这也需要对orders表中的页面进行顺序扫描,这仍然可以非常快.

My best guess would be that you have an index in id_state or even id_state, id_mp that can be used to satisfy the where clause. If so, the first query without the order by would use this index. It should be pretty fast. Even without an index, this requires a sequential scan of the pages in the orders table, which can still be pretty fast.

然后,当您在creation_date上添加索引时,MySQL决定对order by使用该索引.这需要读取索引中的每一行,然后获取相应的数据页以检查where条件并返回列(如果存在匹配项).该读取效率极低,因为它不是按页面"顺序,而是按索引指定的顺序.随机读取可能效率很低.

Then when you add the index on creation_date, MySQL decides to use that index instead for the order by. This requires reading each row in the index, then fetching the corresponding data page to check the where conditions and return the columns (if there is a match). This reading is highly inefficient, because it is not in "page" order but rather as specified by the index. Random reads can be quite inefficient.

更糟糕的是,即使您拥有limit,您仍然必须读取 entire 表,因为需要整个结果集.尽管您已经保存了38条记录的排序,但是您创建了一个效率低下的查询.

Worse, even though you have a limit, you still have to read the entire table because the entire result set is needed. Although you have saved a sort on 38 records, you have created a massively inefficient query.

顺便说一句,如果orders表无法容纳在可用内存中,这种情况将变得更加严重.然后,您有一个称为崩溃"的条件,其中每个新记录都倾向于生成一个新的I/O读取.因此,如果一个页面上有100条记录,则该页面可能必须被读取100次.

By the way, this situation gets significantly worse if the orders table does not fit in available memory. Then you have a condition called "thrashing", where each new record tends to generate a new I/O read. So, if a page has 100 records on it, the page might have to be read 100 times.

通过在orders(id_state, id_mp, creation_date)上建立索引,可以使所有这些查询的运行速度更快. where子句将使用前两列,而order by将使用最后两列.

You can make all these queries run faster by having an index on orders(id_state, id_mp, creation_date). The where clause will use the first two columns and the order by will use the last.

这篇关于为什么在查询中使用LIMIT时MySQL变慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆