我应该如何为具有两个范围条件的查询建立索引? [英] How should I go about indexing for a query with two range conditions?
问题描述
系列中的下一个…
Next in the series…
CREATE TABLE `Alarms` (
`AlarmId` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`DeviceId` BINARY(16) NOT NULL,
`Code` BIGINT(20) UNSIGNED NOT NULL,
`Ended` TINYINT(1) NOT NULL DEFAULT '0',
`NaturalEnd` TINYINT(1) NOT NULL DEFAULT '0',
`Pinned` TINYINT(1) NOT NULL DEFAULT '0',
`Acknowledged` TINYINT(1) NOT NULL DEFAULT '0',
`StartedAt` TIMESTAMP NOT NULL DEFAULT '0000-00-00 00:00:00',
`EndedAt` TIMESTAMP NULL DEFAULT NULL,
`MarkedForDeletion` TINYINT(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`AlarmId`),
KEY `Key1` (`Ended`,`Acknowledged`),
KEY `Key2` (`Pinned`),
KEY `Key3` (`DeviceId`,`Pinned`),
KEY `Key4` (`DeviceId`,`StartedAt`,`EndedAt`),
KEY `Key5` (`DeviceId`,`Ended`,`EndedAt`),
KEY `Key6` (`MarkedForDeletion`,`DeviceId`,`Acknowledged`,`Ended`,`StartedAt`),
KEY `Key7` (`MarkedForDeletion`,`DeviceId`,`Ended`,`Pinned`,`EndedAt`)
) ENGINE=INNODB;
当选择时间戳以匹配许多行时,此查询非常慢:
This query is quite slow when the timestamps are chosen such that many rows are matched:
SELECT (((UNIX_TIMESTAMP(`StartedAt`)) DIV 900) * 900) AS `Period_TS`,
COUNT(`AlarmId`) AS `n`
FROM `Alarms`
WHERE `StartedAt` >= FROM_UNIXTIME(1518990000)
AND `StartedAt` < FROM_UNIXTIME(1518998400)
AND `DeviceId` IN (
UNHEX('00030000000000000000000000000000'),
UNHEX('000300000000000000000000000181cd'),
UNHEX('000300000000000000000000000e7cf6'),
UNHEX('000300000000000000000000000e7cf7'),
UNHEX('000300000000000000000000000f423f')
) AND `MarkedForDeletion` = FALSE
GROUP BY `Period_TS` ASC;
我认为这是因为我在两个字段上混合范围条件( DeviceId
和 StartedAt
)。
I believe that this because I am mixing range conditions on two fields (DeviceId
and StartedAt
).
如果这是真的,我该怎么做才能解决问题?也许是触发使用索引合并的东西?
If that's true, what can I do to resolve the problem? Perhaps something to trigger use of an index merge?
推荐答案
IN
介于 =
和'范围'之间。所以,我对问题的标题嗤之以鼻。两个范围几乎不可能优化;一个 IN
加上一个范围有一定的优化机会。
IN
is sort of between =
and a 'range'. So, I quibble with the title on the Question. Two ranges is virtually impossible to optimize; an IN
plus a range has some chance of optimization.
基于
WHERE `StartedAt` >= FROM_UNIXTIME(1518990000)
AND `StartedAt` < FROM_UNIXTIME(1518998400)
AND `DeviceId` IN (
UNHEX('00030000000000000000000000000000'),
UNHEX('000300000000000000000000000181cd'),
UNHEX('000300000000000000000000000e7cf6'),
UNHEX('000300000000000000000000000e7cf7'),
UNHEX('000300000000000000000000000f423f')
) AND `MarkedForDeletion` = FALSE
我会提供2个索引并让Optimizer决定使用哪个:
I would provide 2 indexes and let the Optimizer decide which to use:
INDEX(MarkedForDeletion, StartedAt, DeviceId)
INDEX(MarkedForDeletion, DeviceId, StartedAt)
一些较新版本的MySQL / MariaDB可以跳过并使用 second 索引中的所有3列。在所有版本中,任一索引的前两列都使其成为候选者。选择可能由统计数据驱动,并且可能(或可能不)是正确的选择。
Some newer versions of MySQL/MariaDB can leapfrog and make use of all 3 columns in the second index. In all versions the first 2 columns of either index makes it a candidate. The choice may be driven by statistics, and may (or may not) be the 'right' choice.
因为 AlarmId
不能 NULL
,使用模式: COUNT(*)
。
Since AlarmId
cannot be NULL
, use the pattern: COUNT(*)
.
进行更改后,我的每个索引都是覆盖,从而进一步提升了性能。
After making that change, each of my indexes is "covering", thereby giving an extra boost in performance.
这篇关于我应该如何为具有两个范围条件的查询建立索引?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!