如何根据EXPLAIN计划优化MySQL查询 [英] How to optimise MySQL queries based on EXPLAIN plan

查看:69
本文介绍了如何根据EXPLAIN计划优化MySQL查询的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

看看查询的EXPLAIN计划,如何确定最佳的优化方式?

Looking at a query's EXPLAIN plan, how does one determine where optimisations can best be made?

我很高兴要检查的第一件事是是否使用了良好的索引,但是除此之外,我有些困惑.过去,通过反复试验,我有时发现执行连接的顺序可以很好地改进,但是如何通过查看执行计划来确定这一点呢?

I appreciate that one of the first things to check is whether good indexes are being used, but beyond that I'm a little stumped. Through trial and error in the past I have sometimes found that the order in which joins are conducted can be a good source of improvement, but how can one determine that from looking at the execution plan?

尽管我非常想对如何优化查询有一个很好的一般理解(建议读者多加赞赏!),但我也意识到,讨论具体案例通常比抽象讨论要容易得多.由于我目前正在用这种方法将我的头撞在墙上,因此您的想法将不胜感激:

Whilst I would very much like to gain a good general understanding of how to optimise queries (suggested reading much appreciated!), I also realise that it's often easier to discuss concrete cases than talk in the abstract. Since I am currently banging my head against the wall with this one, your thoughts would be much appreciated:


id   select_type   table   type     possible_keys    key       key_len   ref                    rows   Extra
 1   SIMPLE        S       const    PRIMARY,l,p,f4   PRIMARY         2   const                     1   Using temporary
 1   SIMPLE        Q       ref      PRIMARY,S        S               2   const                   204   Using index
 1   SIMPLE        V       ref      PRIMARY,n,Q      Q               5   const,db.Q.QID            6   Using where; Using index; Distinct
 1   SIMPLE        R1      ref      PRIMARY,L        L             154   const,db.V.VID          447   Using index; Distinct
 1   SIMPLE        W       eq_ref   PRIMARY,w        PRIMARY         5   const,db.R.RID,const      1   Using where; Distinct
 1   SIMPLE        R2      eq_ref   PRIMARY,L        PRIMARY       156   const,db.W.RID,const      1   Using where; Distinct

我正确地解释执行计划的最后一行,如下所示:

Am I correct in interpreting the final row of the execution plan as follows:

  • 由于它的主键完全匹配,因此每个输出行只需要获取一行R2
  • 但是,然后根据适用于R2的某些条件过滤此类输出行?
  • as it is fully matched on its primary key, only one row of R2 need be fetched per output row;
  • however, such output rows are then filtered based on some criteria that applies to R2?

如果是这样,我的问题就在于在最后一步中进行的过滤.如果条件导致不进行过滤(例如WHERE `Col_1_to_3` IN (1,2,3)),则查询将运行得非常快(〜50ms);但是,如果条件限制了选择的行(WHERE `Col_1_to_3` IN (1,2)),则查询将花费更长的时间(〜5s).如果限制为单个匹配项(WHERE `Col_1_to_3` IN (1)),则优化程序会建议一个完全不同的执行计划(其执行效果略好于5s,但仍比50ms差很多).似乎没有一个更好的索引可以在该表上使用(假设已经完全使用主键为每个结果返回一行了?).

If so, my problem lies in the filtering which occurs in that final step. If the condition results in no filtering (e.g. WHERE `Col_1_to_3` IN (1,2,3)), the query runs extremely quickly (~50ms); however, if the condition restricts the rows selected (WHERE `Col_1_to_3` IN (1,2)), the query takes considerably longer (~5s). If the restriction is to a single match (WHERE `Col_1_to_3` IN (1)), the optimiser suggests an altogether different execution plan (which performs marginally better than 5s, but still a lot worse than 50ms). It doesn't seem as though there's a better index that can be used on that table (given it's already fully using the primary key to return one row per result?).

一个人应该如何解释所有这些信息?我猜对了吗,因为要在要连接的最终表上进行这种输出筛选,所以与早先连接表并更快地过滤此类行相比,这会浪费大量的精力吗?如果是这样,如何确定执行计划中何时应加入R2?

How should one interpret all this information? Am I right in guessing that, because such output filtering is taking place on the final table to be joined, considerable effort is wasted versus joining the table earlier and filtering such rows sooner? If so, how does one determine when in the execution plan R2 ought to be joined?

虽然我拒绝包含查询&这里有完整的模式(因为我真的很可能知道要寻找什么,而不仅仅是被告知答案),我知道有必要推进讨论:

Whilst I resisted including the query & schema in full here (as I would really likely to know what to look for, not merely be told the answer), I understand it's necessary to advance the discussion:

SELECT DISTINCT
    `Q`.`QID`
FROM
    `S`
    NATURAL JOIN `Q`
    NATURAL JOIN `V`
    NATURAL JOIN `R` AS `R1`
    NATURAL JOIN `W`

    JOIN `R` AS `R2` ON (
            `R2`.`SID` = `S`.`SID`
        AND `R2`.`RID` = `R1`.`RID`
        AND `R2`.`VID` = `S`.`V_id`
        AND `R2`.`Col_1_to_3` IN (1,2) -- this is where performance suffers!
    )

WHERE
    AND `S`.`SID` = @x
    AND `W`.`WID` = @y
;

R的定义是:

CREATE TABLE `R` (
  `SID` smallint(6) unsigned NOT NULL,
  `RID` smallint(6) unsigned NOT NULL,
  `VID` varchar(50) NOT NULL DEFAULT '',
  `Col_1_to_3` smallint(1) DEFAULT NULL,
  `T` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`SID`,`RID`,`VID`),
  KEY `L` (`SID`,`VID`,`Col_1_to_3`),
  CONSTRAINT `R_f1` FOREIGN KEY (`SID`) REFERENCES `S` (`SID`),
  CONSTRAINT `R_f2` FOREIGN KEY (`SID`, `VID`) REFERENCES `V` (`SID`, `VID`),
  CONSTRAINT `R_f3` FOREIGN KEY (`SID`, `VID`, `Col_1_to_3`) REFERENCES `L` (`SID`, `VID`, `LID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

推荐答案

取决于要使用的内容和查询的内容.

Depends on what you're going for and what the query is.

通常,对于EXPLAIN中具有Using where的每一行,都需要使用索引(possible keyskeys列)进行设置.这些是您的过滤器,包括WHERE和ON.说Using index更好.这意味着有一个覆盖索引,MySQL可以直接从索引中检索数据,而不必访问表数据中的行.

Generally, for every line in EXPLAIN that has a Using where, you need to have it using an index (possible keys and keys column). These are your filters and include WHERE and ON. Having it say Using index is even better. It means there's a covering index, and MySQL can retrieve the data right from the index rather than visiting the row in the table data.

应该查看没有Using where且正在返回大量行的行.这些是表中所有行的返回值.我不知道您的查询是什么,所以我不知道是否在这里被提醒.尝试过滤结果集以减小大小并提高性能.

The lines where there is no Using where, and it is returning a large number of rows should be looked at. These are returning values for all rows in the table. I don't know what your query is, so I don't know whether to be alarmed here. Try filtering the result set to reduce the size and improve performance.

您通常应该尝试避免看到Using filesortUsing temporary,尽管只有在您不期望它们的情况下它们才是不好的.

You generally should try to avoid seeing Using filesort or Using temporary, though those are only bad if you're not expecting them.

文件排序通常与ORDER子句一起出现.通常,您希望MySQL使用覆盖索引(Using index),以便已按顺序从服务器返回行.如果不是,则MySQL必须在以后使用文件排序对其进行排序.

Filesort usually appears with the ORDER clause. You generally want MySQL to use a covering index (Using index) so that the rows are returned already in order from the server. If they're not, then MySQL must order them afterward, using filesort.

Using temporary引用派生表时可能会很糟糕,因为它们没有索引.看来您已经明确创建了一个带有索引的临时表,因此在这里还不错.有时,您唯一的选择是使用派生表,因此使用Using temporary.

Using temporary can be bad when it refers to derived tables because they don't have indexes. It seems that you've explicitly created a temporary table with indexes, so here, it's not bad. Sometimes, your only choice is to use a derived table, and hence Using temporary.

这篇关于如何根据EXPLAIN计划优化MySQL查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆