查询花了将近两秒但只匹配两行 - 为什么索引没有帮助? [英] Query takes almost two seconds but matches only two rows - why isn't the index helping?

查看:106
本文介绍了查询花了将近两秒但只匹配两行 - 为什么索引没有帮助?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

表:

CREATE TABLE `Alarms` (
  `AlarmId` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `DeviceId` BINARY(16) NOT NULL,
  `Code` BIGINT(20) UNSIGNED NOT NULL,
  `Ended` TINYINT(1) NOT NULL DEFAULT '0',
  `NaturalEnd` TINYINT(1) NOT NULL DEFAULT '0',
  `Pinned` TINYINT(1) NOT NULL DEFAULT '0',
  `Acknowledged` TINYINT(1) NOT NULL DEFAULT '0',
  `StartedAt` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
  `EndedAt` TIMESTAMP NULL DEFAULT NULL,
  `MarkedForDeletion` TINYINT(1) NOT NULL DEFAULT '0',
  PRIMARY KEY (`AlarmId`),
  KEY `Key1` (`Ended`,`Acknowledged`),
  KEY `Key2` (`Pinned`),
  KEY `Key3` (`DeviceId`,`Pinned`),
  KEY `Key4` (`DeviceId`,`StartedAt`,`EndedAt`),
  KEY `Key5` (`DeviceId`,`Ended`,`EndedAt`),
  KEY `Key6` (`MarkedForDeletion`),

  KEY `KeyB` (`MarkedForDeletion`,`DeviceId`,`StartedAt`,`EndedAt`,`Acknowledged`,`Pinned`)
) ENGINE=INNODB;

目前它有大约300万行。

It currently has about three million rows in it.

查询:

SELECT
COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE `StartedAt` < FROM_UNIXTIME(1519101900)
AND (`EndedAt` IS NULL OR `EndedAt` > FROM_UNIXTIME(1519101900))
AND `DeviceId` = UNHEX('00030000000000000000000000000000')
AND `MarkedForDeletion` = FALSE
AND (
     (`Alarms`.`EndedAt` IS NULL AND `Alarms`.`Acknowledged` = FALSE)
  OR ( `Alarms`.`EndedAt` IS NOT NULL AND `Alarms`.`Pinned` = TRUE)
)

查询计划:

id      select_type     table   type    possible_keys   key     key_len ref     rows    Extra
1       SIMPLE  Alarms  range   Key2,Key3,Key4,Key5,Key6,KeyB   KeyB    21              1574778 Using where; Using index

经过时间:1,763,222μs

Elapsed time: 1,763,222μs

在这种特殊情况下,查询(正确)甚至不匹配很多行(结果是 n = 2 )。

In this particular case the query (correctly) doesn't even match many rows (the result is n = 2).

从我使用索引合并学到的东西(虽然我还没有那么正确),我尝试重新组织条件(原始是由一些C ++生成的,基于输入条件,因此奇怪的运算符分配):

Taking what I learnt from working with index merges (though I still haven't got that right), I tried reorganising the conditions a bit (the original was generated by some C++, based on input conditions, hence the strange operator distribution):

SELECT COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE 
(
    `EndedAt` IS NULL
    AND `Acknowledged` = FALSE

    AND `StartedAt` < FROM_UNIXTIME(1519101900)
    AND `MarkedForDeletion` = FALSE
    AND `DeviceId` = UNHEX('00030000000000000000000000000000')
) OR (
    `EndedAt` > FROM_UNIXTIME(1519101900)
    AND `Pinned` = TRUE

    AND `StartedAt` < FROM_UNIXTIME(1519101900)
    AND `MarkedForDeletion` = FALSE
    AND `DeviceId` = UNHEX('00030000000000000000000000000000')
);

…但结果是一样的。

…but the result is the same.

那为什么需要这么长时间?如何修改它/索引以使其立即生效?

So why does it take so long? How can I modify it / the indexes to make it work instantly?

推荐答案


  • OR 非常难以优化。

  • MySQL几乎从不在单个查询中使用两个索引。

    • OR is notoriously hard to optimize.
    • MySQL almost never uses two indexes in a single query.
    • 要避免这两种情况,请将转换为 UNION 。每个 SELECT 都可以使用不同的索引。因此,为每个人建立一个最优的 INDEX

      To avoid both of those, turn OR into UNION. Each SELECT can use its a different index. So, build an optimal INDEX for each.

      实际上,因为你只做 COUNT ,您也可以评估两个单独的计数并添加它们。

      Actually, since you are only doing COUNT, you may as well evaluate two separate counts and add them.

      SELECT ( SELECT  COUNT(*)
                  FROM  `Alarms`
                  WHERE  `EndedAt` IS NULL
                    AND  `Acknowledged` = FALSE
                    AND  `StartedAt` < FROM_UNIXTIME(1519101900)
                    AND  `MarkedForDeletion` = FALSE
                    AND  `DeviceId` = UNHEX('00030000000000000000000000000000' )
             ) + 
             ( SELECT  COUNT(*)
                  FROM  `Alarms`
                  WHERE  `EndedAt` > FROM_UNIXTIME(1519101900)
                    AND  `Pinned` = TRUE
                    AND  `StartedAt` < FROM_UNIXTIME(1519101900)
                    AND  `MarkedForDeletion` = FALSE
                    AND  `DeviceId` = UNHEX('00030000000000000000000000000000')
             ) AS `n`;
      
      INDEX(DeviceId, Acknowledged, MarkedForDeletion, EndedAt, StartedAt) -- for first
      INDEX(DeviceId, Pinned, MarkedForDeletion, EndedAt, StartedAt) -- for second
      INDEX(DeviceId, Pinned, MarkedForDeletion, StartedAt, EndedAt) -- for second
      

      嗯,这不起作用如果有重叠。所以,让我们回到 UNION 模式:

      Well, that won't work if there is overlap. So, let's go back to the UNION pattern:

      SELECT  COUNT(*) AS `n`
          FROM
          (
              ( SELECT  AlarmId
                  FROM  `Alarms`
                  WHERE  `EndedAt` IS NULL
                    AND  `Acknowledged` = FALSE
                    AND  `StartedAt` < FROM_UNIXTIME(1519101900)
                    AND  `MarkedForDeletion` = FALSE
                    AND  `DeviceId` = UNHEX('00030000000000000000000000000000')  
              )
              UNION DISTINCT
              ( SELECT  AlarmId
                  FROM  `Alarms`
                  WHERE  `EndedAt` > FROM_UNIXTIME(1519101900)
                    AND  `Pinned` = TRUE
                    AND  `StartedAt` < FROM_UNIXTIME(1519101900)
                    AND  `MarkedForDeletion` = FALSE
                    AND  `DeviceId` = UNHEX('00030000000000000000000000000000')
               )
          );
      

      再次添加这些索引。

      每个 INDEX 中的前几列可以按任何顺序排列,因为它们是使用 = (或<$ c)进行测试的$ c> IS NULL )。最后一两个是范围测试。只有第一个范围用于过滤,但我包含了另一个列,因此索引将是覆盖。

      The first few columns in each INDEX can be in any order, since they are tested with = (or IS NULL). The last one or two are "range" tests. Only the first range will be used for filtering, but I included the other column so that the index would be "covering".

      我的配方可能比索引合并更好。

      My formulations may be better than "index merge".

      这篇关于查询花了将近两秒但只匹配两行 - 为什么索引没有帮助?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆