Group only certain rows with GROUP BY
SCHEMA
I have the following set-up in a MySQL database:
CREATE TABLE items (
    id SERIAL,
    name VARCHAR(100),
    group_id INT,
    price DECIMAL(10,2),
    KEY items_group_id_idx (group_id),
    PRIMARY KEY (id)
);
INSERT INTO items VALUES
    (1, 'Item A', NULL, 10),
    (2, 'Item B', NULL, 20),
    (3, 'Item C', NULL, 30),
    (4, 'Item D', 1, 40),
    (5, 'Item E', 2, 50),
    (6, 'Item F', 2, 60),
    (7, 'Item G', 2, 70);
PROBLEM
I need to select:
- all items with group_id that has a NULL value, and
- one item from each group identified by group_id having the lowest price.
EXPECTED RESULTS
+----+--------+----------+-------+
| id | name   | group_id | price |
+----+--------+----------+-------+
|  1 | Item A |     NULL | 10.00 |
|  2 | Item B |     NULL | 20.00 |
|  3 | Item C |     NULL | 30.00 |
|  4 | Item D |        1 | 40.00 |
|  5 | Item E |        2 | 50.00 |
+----+--------+----------+-------+
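POSSIBLE SOLUTION 1: Two queries with UNION ALL
-- Branch 1: all items that do not belong to a group
SELECT id, name, group_id, price FROM items
WHERE group_id IS NULL
UNION ALL
-- Branch 2: one row per group; both branches must return the same four columns
SELECT id, name, group_id, MIN(price) FROM items
WHERE group_id IS NOT NULL
GROUP BY group_id;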
/* EXPLAIN */
+------+--------------+------------+------+--------------------+--------------------+---------+-------+------+----------------------------------------------+
| id   | select_type  | table      | type | possible_keys      | key                | key_len | ref   | rows | Extra                                        |
+------+--------------+------------+------+--------------------+--------------------+---------+-------+------+----------------------------------------------+
| 1    | PRIMARY      | items      | ref  | items_group_id_idx | items_group_id_idx | 5       | const | 3    | Using where                                  |
| 2    | UNION        | items      | ALL  | items_group_id_idx | NULL               | NULL    | NULL  | 7    | Using where; Using temporary; Using filesort |
| NULL | UNION RESULT | <union1,2> | ALL  | NULL               | NULL               | NULL    | NULL  | NULL |                                              |
+------+--------------+------------+------+--------------------+--------------------+---------+-------+------+----------------------------------------------+
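One thing worth noting about the second branch of the UNION ALL query: it selects id and name next to MIN(price) while grouping only by group_id. MySQL either rejects that (with ONLY_FULL_GROUP_BY enabled) or takes id and name from an unspecified row of the group, so they are not guaranteed to belong to the cheapest item. Below is a minimal sketch of a UNION ALL variant that joins back to the per-group minimum. It is not from the original post. It runs against an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL, the id column type is adapted for SQLite, and the alias min_price is made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id INTEGER PRIMARY KEY,  -- SERIAL in the MySQL schema
    name VARCHAR(100),
    group_id INT,
    price DECIMAL(10,2)
);
INSERT INTO items VALUES
    (1, 'Item A', NULL, 10),
    (2, 'Item B', NULL, 20),
    (3, 'Item C', NULL, 30),
    (4, 'Item D', 1, 40),
    (5, 'Item E', 2, 50),
    (6, 'Item F', 2, 60),
    (7, 'Item G', 2, 70);
""")

# Branch 1: every item without a group.
# Branch 2: items whose price equals the minimum price of their group.
# min_price is a hypothetical alias; ties on price would return several rows.
UNION_QUERY = """
SELECT id, name, group_id, price
FROM items
WHERE group_id IS NULL
UNION ALL
SELECT i.id, i.name, i.group_id, i.price
FROM items i
JOIN (
    SELECT group_id, MIN(price) AS min_price
    FROM items
    WHERE group_id IS NOT NULL
    GROUP BY group_id
) m ON m.group_id = i.group_id AND m.min_price = i.price
ORDER BY id
"""

union_rows = conn.execute(UNION_QUERY).fetchall()
for row in union_rows:
    print(row)
```

This still uses two branches, so it does not remove the concern about repeating a complex WHERE condition; it only pins id and name to the row that actually holds the minimum price.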
However, it is undesirable to have two queries, since there will be a more complex condition in the WHERE clause and I would need to sort the final results.
POSSIBLE SOLUTION 2: GROUP BY on expression (reference)
-- Rows with a NULL group_id each get a random grouping key, so they stay ungrouped
SELECT id, name, group_id, MIN(price) FROM items
GROUP BY CASE WHEN group_id IS NOT NULL THEN group_id ELSE RAND() END;
/* EXPLAIN */
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                           |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
| 1  | SIMPLE      | items | ALL  | NULL          | NULL | NULL    | NULL | 7    | Using temporary; Using filesort |
+----+-------------+-------+------+---------------+------+---------+------+------+---------------------------------+
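Solution 2 seems to be faster and simpler to use, but I'm wondering whether there is a better approach in terms of performance.
UPDATE:
According to the documentation referenced by @axiac, this query is illegal in SQL92 and earlier and may work in MySQL only.
According to this answer by @axiac, a better solution in terms of compatibility and performance is shown below.
It is also explained in the SQL Antipatterns book, Chapter 15: Ambiguous Groups. To improve performance, a combined index was also added on (group_id, price, id).
SOLUTION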
SELECT a.id, a.name, a.group_id, a.price
FROM items a
LEFT JOIN items b
    ON a.group_id = b.group_id
    AND (a.price > b.price OR (a.price = b.price AND a.id > b.id))
WHERE b.price IS NULL;
See the explanation of how it works for more details. By accident, as a side effect, this query works in my case, where I needed to include ALL records with group_id equal to NULL AND one item from each group with the lowest price.
RESULT
+----+--------+----------+-------+
| id | name   | group_id | price |
+----+--------+----------+-------+
|  1 | Item A |     NULL | 10.00 |
|  2 | Item B |     NULL | 20.00 |
|  3 | Item C |     NULL | 30.00 |
|  4 | Item D |        1 | 40.00 |
|  5 | Item E |        2 | 50.00 |
+----+--------+----------+-------+
EXPLAIN
+----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
| id | select_type | table | type | possible_keys                 | key                | key_len | ref                        | rows | Extra                    |
+----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
| 1  | SIMPLE      | a     | ALL  | NULL                          | NULL               | NULL    | NULL                       | 7    |                          |
| 1  | SIMPLE      | b     | ref  | PRIMARY,id,items_group_id_idx | items_group_id_idx | 5       | agi_development.a.group_id | 1    | Using where; Using index |
+----+-------------+-------+------+-------------------------------+--------------------+---------+----------------------------+------+--------------------------+
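The side effect has a simple cause: for a row whose group_id is NULL, the comparison a.group_id = b.group_id is never true, so the LEFT JOIN finds no partner and the row passes the b.price IS NULL filter. For grouped rows, a partner exists only when some row in the same group is cheaper (or equally priced with a lower id), so only the cheapest row of each group survives. The sketch below replays this on the sample data. It assumes an in-memory SQLite database, driven from Python's sqlite3 module, as a stand-in for MySQL; the index name items_group_price_id_idx and the ORDER BY clause are additions for illustration and are not part of the original post.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (
    id INTEGER PRIMARY KEY,  -- SERIAL in the MySQL schema
    name VARCHAR(100),
    group_id INT,
    price DECIMAL(10,2)
);
-- Combined index on (group_id, price, id); the name is hypothetical.
CREATE INDEX items_group_price_id_idx ON items (group_id, price, id);
INSERT INTO items VALUES
    (1, 'Item A', NULL, 10),
    (2, 'Item B', NULL, 20),
    (3, 'Item C', NULL, 30),
    (4, 'Item D', 1, 40),
    (5, 'Item E', 2, 50),
    (6, 'Item F', 2, 60),
    (7, 'Item G', 2, 70);
""")

# Anti-join: keep row a only if no row b in the same group beats it on
# (price, id). ORDER BY is added here only to make the output deterministic.
ANTI_JOIN_QUERY = """
SELECT a.id, a.name, a.group_id, a.price
FROM items a
LEFT JOIN items b
    ON a.group_id = b.group_id
    AND (a.price > b.price OR (a.price = b.price AND a.id > b.id))
WHERE b.price IS NULL
ORDER BY a.id
"""

anti_join_rows = conn.execute(ANTI_JOIN_QUERY).fetchall()
for row in anti_join_rows:
    print(row)
```

Because the tie-break on id is part of the join condition, two items sharing the lowest price in a group would still yield a single row, the one with the smaller id.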