什么索引可以提高JOIN和GROUP BY的性能 [英] What indexes to improve performance of JOIN and GROUP BY
问题描述
我已经设置了一些表并运行了一个查询.但是在我的解释中,它将在临时表中生成SQL结果(我认为这是由于GROUP BY引起的)
I have setup some tables and ran a query. However in my explain it would appear the SQL results in a temporary table being generated ( I assume this is because of the GROUP BY)
我添加了一些索引来加快查询的速度,但是想知道是否有一种方法可以停止使用临时表,是否还有其他方法可以加快使用索引的查询速度?
I have added some indexes to speed up the query but wondering if there was a way to stop the use of a temporary table and if there is any other way I can speed my query up using indexes?
CartData
CREATE TABLE `cartdata` (
`IDCartData` INT(11) NOT NULL AUTO_INCREMENT,
`CartOrderref` VARCHAR(25) NOT NULL DEFAULT '',
`UserID` INT(11) NOT NULL DEFAULT '0',
`LastUpdate` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE
CURRENT_TIMESTAMP,
`ManualContactName` VARCHAR(100) NOT NULL DEFAULT '',
`ManualOrderConfirmationEmail` VARCHAR(100) NOT NULL DEFAULT '',
PRIMARY KEY (`IDCartData`),
INDEX `CartOrderref` (`CartOrderref`)
)
CartSplitData
CREATE TABLE `cartsplitdata` (
`IDCartSupplierData` INT(11) NOT NULL AUTO_INCREMENT,
`IDCartData` INT(11) NOT NULL DEFAULT '0',
`supplierid` INT(11) NOT NULL DEFAULT '0',
`DeliveryDate` DATE NOT NULL DEFAULT '2000-01-01',
`AccountNumber` VARCHAR(50) NOT NULL DEFAULT '',
`ManualOrderref` VARCHAR(50) NOT NULL DEFAULT '',
`lastupdate` DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`IDCartSupplierData`),
INDEX `cartdatasupplierid` (`IDCartData`, `supplierid`)
)
我的示例查询
EXPLAIN SELECT max(CartData.idCartDATA) AS idCartDATA , CartData.*, CartSplitData.*
FROM CartData
JOIN CartSplitData ON CartSplitData.IDCartDATA = CartDATA.IDCartData
WHERE CartData.CartOrderref = 'XXXXXXXXX'
group by CartSplitData.SUPPLIERID
说明查询结果
推荐答案
专业提示避免使用 SELECT *
或 SELECT表.*
敏感查询.而是按名称选择您实际需要使用的列
Pro tip Avoid SELECT *
or SELECT table.*
in performance-sensitive queries. Instead select, by name, the columns you actually need to use
专业提示:MySQL对您正在使用的 GROUP BY
有一个臭名昭著的非标准扩展,并且可能会被滥用.读这个. https://dev.mysql.com/doc/refman/8.0/en/group-by-handling.html 如果您遵循第一个提示,那么遵循第二个提示会容易得多.
Pro tip MySQL has a notorious nonstandard extension to GROUP BY
which you are using, and possibly misusing. Read this. https://dev.mysql.com/doc/refman/8.0/en/group-by-handling.html If you followed the first pro tip, following the second would be much easier.
专业提示,避免抛出"大量单列索引,以期加快查询速度.相反,创建索引(通常是复合索引)以匹配实际查询的需求.阅读此 https://use-the-index-luke.com .
Pro tip Avoid "throwing in" lots of single column indexes in hopes of accelerating your queries. Instead, create indexes, often compound indexes, matching the needs of your actual query. Read this https://use-the-index-luke.com .
专业提示 使用临时功能;使用出现在EXPLAIN输出中的filesort
不一定是不好的.这仅表示查询引擎必须在返回部分结果集之前对其进行缓存. temporary
东西不是一个实际的表,它是一个RAM结构.如果它太大而导致RAM泛滥,MySQL会将其溢出到磁盘上.但是你不是.
Pro tip Using temporary; using filesort
appearing in EXPLAIN output is not necessarily bad. It simply means that the query engine has to cache a partial result set before returning it. The temporary
thing isn't an actual table, it's a RAM structure. If it is so big it swamps RAM, MySQL will spill it to disk. But yours isn't.
所有这些,让我们重构您的查询.我猜想您要为每个 CartSplitData.SUPPLIERID
检索具有最大 idCartDATA
值的行.
All that being said, let's refactor your query. I guess you want to retrieve the rows with the largest idCartDATA
value for each CartSplitData.SUPPLIERID
.
因此,我们将其写为子查询.
So let's write that as a subquery.
SELECT max(IDCartDATA) AS IDCartDATA, SUPPLIERID
FROM CartSplitData
GROUP BY SUPPLIERID
通过将复合索引放在CartSplitData上,可以大大加快该查询的速度:(SUPPLIERID,IDCartDATA)
.
This query can be sped up, dramatically, by putting a compound index on CartSplitData: (SUPPLIERID, IDCartDATA)
.
接下来,让我们重写您的主查询,以找到与该子查询中的ID相匹配的行.
Next, let's rewrite your main query to find the rows matching the ids in that subquery.
SELECT CartData.* /* * hammers performance */
CartSplitData.* /* * hammers performance */
FROM CartData
JOIN CartSplitData ON CartSplitData.IDCartDATA = CartDATA.IDCartData
JOIN (
SELECT max(IDCartDATA) AS IDCartDATA, SUPPLIERID
FROM CartSplitData
GROUP BY SUPPLIERID
)x ON x.SUPPLIERID = CartSplitData.SUPPLIERID
AND x.IDCartData = CartSplitData.IDCartData
WHERE CartData.CartOrderref = 'XXXXXXXXX'
您在 CartData.CartOrderref
上的索引以及该复合索引创建的^^^都将帮助此外部查询.
Your index on CartData.CartOrderref
will help this outer query, as will the compound index created ^^^.
这篇关于什么索引可以提高JOIN和GROUP BY的性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!