选择每组的第一行 [英] Selecting first row per group
问题描述
我有一个包含三列的表格:A,B,C
.
I have a table with three columns: A,B,C
.
值为:
+---+-----+----+
| A | B | C |
+---+-----+----+
| 1 | -10 | 5 |
| 1 | 0 | 5 |
| 1 | 10 | 5 |
| 2 | 10 | 12 |
| 2 | 0 | 12 |
| 3 | -10 | 14 |
| 4 | 0 | 8 |
| 4 | 10 | 8 |
| 5 | 0 | 6 |
| 5 | 1 | 6 |
| 5 | -5 | 6 |
+---+-----+----+
如果我首先按 A
列对数据进行排序,然后是 B
列,然后是 C
列(尽管我确实使所有列 >C
每列的值相同 A
值)我将如何选择每列 A
的第一行"?
If I first order the data by column A
, then column B
, then column C
(although I did make all column C
values the same per column A
value) how would I select the "first row" per column A
?
所以,这应该导致:
+---+-----+----+
| A | B | C |
+---+-----+----+
| 1 | -10 | 5 |
| 2 | 0 | 12 |
| 3 | -10 | 14 |
| 4 | 0 | 8 |
| 5 | -5 | 6 |
+---+-----+----+
推荐答案
SELECT a, b, c
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY a ORDER BY b, c) rn
FROM mytable
) q
WHERE rn = 1
ORDER BY
a
或
SELECT mi.*
FROM (
SELECT DISTINCT a
FROM mytable
) md
CROSS APPLY
(
SELECT TOP 1 *
FROM mytable mi
WHERE mi.a = md.a
ORDER BY
b, c
) mi
ORDER BY
a
在 (a, b, c)
上创建一个复合索引,以便查询更快地工作.
Create a composite index on (a, b, c)
for the queries to work faster.
哪个更有效取决于您的数据分布.
Which one is more efficient depends on your data distribution.
如果 a
的不同值很少,但每个 a
中有很多记录,那么第二个查询会更好.
If you have few distinct values of a
but lots of records within each a
, the second query would be better.
您可以通过创建索引视图进一步改进它:
You could improve it even more by creating an indexed view:
CREATE VIEW v_mytable_da
WITH SCHEMABINDING
AS
SELECT a, COUNT_BIG(*) cnt
FROM dbo.mytable
GROUP BY
a
GO
CREATE UNIQUE CLUSTERED INDEX
pk_vmytableda_a
ON v_mytable_da (a)
GO
SELECT mi.*
FROM v_mytable_da md
CROSS APPLY
(
SELECT TOP 1 *
FROM mytable mi
WHERE mi.a = md.a
ORDER BY
b, c
) mi
ORDER BY
a
这篇关于选择每组的第一行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!