GROUP BY with MAX() returns the wrong row id
Problem description
I want to get the maximum used capacity for each order_product_id for each week. Variants that use a JOIN or a SELECT in the WHERE clause do not work, because the same max_capacity value is repeated for some order_product_ids. My query returns the correct order_product_id and max_capacity for each week, but not the correct row id.
CREATE TABLE `capacity_log` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`date_occurred` DATETIME NOT NULL,
`ip_address` VARCHAR(255) NOT NULL DEFAULT '',
`order_product_id` INT UNSIGNED NOT NULL,
`serial` VARCHAR(255) NOT NULL DEFAULT '',
`used_capacity` BIGINT NULL DEFAULT NULL,
`aux2` INT NULL DEFAULT NULL,
`request` BLOB NULL,
`retry_count` INT NOT NULL DEFAULT '0',
`fetch_time` INT NOT NULL DEFAULT '0',
`response` BLOB NULL,
`custom_fetch_time` INT NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
INDEX `user_id` (`order_product_id`))
My query:
SELECT c.order_product_id, MAX(c.used_capacity) AS `max_capacity`
FROM capacity_log c
WHERE c.date_occurred < '2020-10-1' AND c.aux2 IS NULL
GROUP BY
YEAR(c.date_occurred), WEEK(c.date_occurred),
c.order_product_id
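Recommended answer
You want entire rows, so aggregation is not what you are after. Instead, you need to filter. One option uses a correlated subquery that, for each row, finds the id of the row with the highest used_capacity for the same order_product_id and week: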
select c.*
from capacity_log c
where c.id = (
select c1.id
from capacity_log c1
where
c1.date_occurred < '2020-10-1'
and c1.aux2 is null
and c1.order_product_id = c.order_product_id
and yearweek(c1.date_occurred) = yearweek(c.date_occurred)
order by c1.used_capacity desc limit 1
)
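We can optimize the where clause of the subquery by replacing the yearweek() comparison with a range on date_occurred, which an index can use. Note that weekday() returns 0 for Monday, so this version defines weeks as starting on Monday, whereas yearweek() in its default mode starts weeks on Sunday: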
where
c1.date_occurred < '2020-10-1'
and c1.aux2 is null
and c1.order_product_id = c.order_product_id
and c1.date_occurred >= date(c.date_occurred) - interval weekday(c.date_occurred) day
and c1.date_occurred < date(c.date_occurred) - interval weekday(c.date_occurred) day + interval 7 day
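Put together, the full query with the range filter might look like the sketch below. Two details are assumptions that go beyond the original answer: date() is applied to c.date_occurred so that the weekly window starts at Monday 00:00 rather than at the row's own time of day, and c1.id is added to the order by so that ties on used_capacity are resolved the same way on every run.

```sql
-- Sketch: one row per order_product_id per Monday-based week,
-- keeping the row with the highest used_capacity.
select c.*
from capacity_log c
where c.id = (
    select c1.id
    from capacity_log c1
    where
        c1.date_occurred < '2020-10-1'
        and c1.aux2 is null
        and c1.order_product_id = c.order_product_id
        -- Monday 00:00 of the week containing c.date_occurred
        -- (weekday() returns 0 for Monday; date() drops the time part)
        and c1.date_occurred >= date(c.date_occurred) - interval weekday(c.date_occurred) day
        -- Monday 00:00 of the following week
        and c1.date_occurred < date(c.date_occurred) - interval weekday(c.date_occurred) day + interval 7 day
    -- Assumption: when several rows share the maximum, keep the lowest id
    order by c1.used_capacity desc, c1.id
    limit 1
);
```

Because the subquery applies the date and aux2 filters itself, an outer row can only match when it also satisfies them, so the filters do not need to be repeated in the outer query.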
For performance, you want an index on (order_product_id, aux2, date_occurred, used_capacity, id).
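As a sketch, that index could be created with the statement below. The index name idx_capacity_weekly_max is made up for illustration; the columns and their order are the ones recommended in the answer.

```sql
-- Hypothetical index name; column list and order as recommended in the answer.
create index idx_capacity_weekly_max
    on capacity_log (order_product_id, aux2, date_occurred, used_capacity, id);
```

Running EXPLAIN on the query before and after adding the index is a reasonable way to check whether the optimizer actually picks it up.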