Mysql子查询总是在做filesort [英] Mysql subquery always doing filesort
问题描述
我有一张带有索引的gameplatform_pricehistory表:
( id_app
, country
, dateup
)
I have a table gamesplatform_pricehistory with an index :
(id_app
,country
,dateup
)
这样做
explain select dateup from gamesplatform_pricehistory
where id_app=1 and country=1
order by dateup desc limit 1
显示使用位置;使用索引
shows "Using where; Using index"
但是带有子查询:
explain select app.id, (select dateup from gamesplatform_pricehistory
where id_app=app.id and country=1
order by dateup desc limit 1)
from app where id > 0;
显示使用where;使用索引;使用filesort
shows Using where; Using index; Using filesort
这是一个直接显示问题的sqlfiddle:
http://sqlfiddle.com/#!2/034bc/1
Here is a sqlfiddle showing directly the problem : http://sqlfiddle.com/#!2/034bc/1
基准数百万行:
(table games_platform与app相同):
benchmark with millions rows : (table games_platform is the same as app) :
SELECT sql_no_cache thepricehistory.dateup
FROM games_platform
LEFT JOIN (SELECT max(dateup) as dateup, id_app
FROM gamesplatform_pricehistory
WHERE country='229' GROUP BY id_app
) thepricehistory
ON thepricehistory.id_app =games_platform.id
WHERE games_platform.id=2
eval:0.8s
eval : 0.8s
SELECT sql_no_cache ( SELECT dateup FROM gamesplatform_pricehistory
WHERE id_app= games_platform.id AND country='229'
ORDER BY dateup DESC LIMIT 1
) AS dateup
FROM games_platform
WHERE games_platform.id=2
eval: 0.0003s
eval : 0.0003s
推荐答案
使用filesort
并不一定是件坏事。这个名字有点误导。虽然它包含文件,但并不意味着数据写在硬盘上的任何地方。它仍然只是在内存中处理。
Using filesort
isn't necessarily a bad thing. The name is a bit misleading. Though it contains "file", it does not mean, that the data is written anywhere on hard disk. It still is just processed in memory.
来自手册:
MySQL必须做一个额外的传递以找出如何检索按排序顺序排列的行。排序是通过根据连接类型遍历所有行并将排序键和指针存储到与WHERE子句匹配的所有行的行来完成的。然后对键进行排序,并按排序顺序检索行。参见章节 8.2.1.11,ORDER BY Optimization一>。
您理解为什么会在您的查询中发生这种情况,对吧?使用这种子查询是不好的样式,因为它是依赖子查询。对于 app
表中的每一行,都会执行子查询。很坏。使用 join
重写查询。
You understand why this happens in your query, right? Using this kind of subqueries is bad style since it's a dependent subquery. For every row in your app
table the subquery is executed. Very bad. Rewrite the query with a join
.
select app.id,
gp.dateup
from app
join gamesplatform_pricehistory gp on gp.id_app = app.id
where app.id > 0
and gp.country = 1
and gp.dateup = (SELECT MAX(dateup) FROM gamesplatform_pricehistory smgp WHERE smgp.id_app = gp.id_app AND smgp.country = 1)
;
这仍然使用从属子查询,但解释
看起来好多了:
This still uses a dependent subquery, but the explain
looks much better:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|--------------------|-------|-------|---------------|---------|---------|----------------------------|------|--------------------------|
| 1 | PRIMARY | app | index | PRIMARY | PRIMARY | 4 | (null) | 2 | Using where; Using index |
| 1 | PRIMARY | gp | ref | id_app | id_app | 5 | db_2_034bc.app.id,const | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | smgp | ref | id_app | id_app | 5 | db_2_034bc.gp.id_app,const | 1 | Using index |
另一种重写方法是:
select app.id,
gp.dateup
from app
LEFT join
(SELECT id_app, MAX(dateup) AS dateup
FROM gamesplatform_pricehistory
WHERE country = 1
GROUP BY id_app
)gp on gp.id_app = app.id
where app.id > 0
;
解释看起来更好:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|----|-------------|----------------------------|-------|---------------|---------|---------|--------|------|--------------------------|
| 1 | PRIMARY | app | index | PRIMARY | PRIMARY | 4 | (null) | 2 | Using where; Using index |
| 1 | PRIMARY | <derived2> | ALL | (null) | (null) | (null) | (null) | 2 | |
| 2 | DERIVED | gamesplatform_pricehistory | index | (null) | id_app | 13 | (null) | 2 | Using where; Using index |
这是一个完全没有依赖子查询的版本:
And here is a version where you have no dependent subquery at all:
select app.id,
gp.dateup
from app
left join gamesplatform_pricehistory gp on gp.id_app = app.id and country = 1
left join gamesplatform_pricehistory gp2 on gp.id_app = app.id and country = 1 and gp.dateup < gp2.dateup
where app.id > 0
and gp2.dateup is null
;
它的工作方式如下: gp.dateup
处于最大值,没有 gp2.dateup
。
It works like this: When gp.dateup
is at its maximum, there is no gp2.dateup
.
这篇关于Mysql子查询总是在做filesort的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!