组内的MySQL排序 [英] within group sorts in mysql
问题描述
我有一个面板数据集:即times
,ids
和values
.我想根据每个日期的值进行排名.我可以通过运行以下命令非常简单地实现排序:
I have a panel data set: that is, times
, ids
, and values
. I would like to do a ranking based on value for each date. I can achieve the sort very simply by running:
select * from tbl order by date, value
我遇到的问题是,以这种方式对表格进行排序后,如何检索每个组的行号(也就是说,对于每个日期,我希望有一个列从1到N的排名).
The issue I have is once the table is sorted in this way, how do I retrieve the row number of each group (that is, for each date I would like there to be a column called ranking that goes from 1 to N).
示例:
输入:
Date, ID, Value
d1, id1, 2
d1, id2, 1
d2, id1, 10
d2, id2, 11
输出:
Date, ID, Value, Rank
d1, id2, 1, 1
d1, id1, 2, 2
d2, id1, 10, 1
d2, id2, 11, 2
推荐答案
缺少窗口函数,您可以订购tbl
并使用用户变量自己计算分区(日期"值)的等级:
Absent window functions, you can order tbl
and use user variables to compute rank over your partitions ("date" values) yourself:
SELECT "date", -- D) Desired columns
id,
value,
rank
FROM (SELECT "date", -- C) Rank by date
id,
value,
CASE COALESCE(@partition, "date")
WHEN "date" THEN @rank := @rank + 1
ELSE @rank := 1
END AS rank,
@partition := "date" AS dummy
FROM (SELECT @rank := 0 AS rank, -- A) User var init
@partition := NULL AS partition) dummy
STRAIGHT_JOIN
( SELECT "date", -- B) Ordering query
id,
value
FROM tbl
ORDER BY date, value) tbl_ordered;
更新
那么,该查询在做什么?
Update
So, what is that query doing?
我们正在使用用户变量来循环"经过排序的结果集,根据结果集的哪个连续段(在@partition
中跟踪)来递增或重置计数器(@rank
) )我们进入了.
We are using user variables to "loop" through a sorted result set, incrementing or resetting a counter (@rank
) depending upon which contiguous segment of the result set (tracked in @partition
) we're in.
在查询 A 中,我们初始化了两个用户变量.在查询 B 中,我们以所需的顺序获取表的记录:首先按日期,然后按值. A 和 B 共同组成一个派生表tbl_ordered
,看起来像这样:
In query A we initialize two user variables. In query B we get the records of your table in the order we need: first by date and then by value. A and B together make a derived table, tbl_ordered
, that looks something like this:
rank | partition | "date" | id | value
---- + --------- + ------ + ---- + -----
0 | NULL | d1 | id2 | 1
0 | NULL | d1 | id1 | 2
0 | NULL | d2 | id1 | 10
0 | NULL | d2 | id2 | 11
请记住,我们并不真正在意列dummy.rank
和dummy.partition
-它们只是我们初始化变量@rank
和@partition
的偶然原因.
Remember, we don't really care about the columns dummy.rank
and dummy.partition
— they're just accidents of how we initialize the variables @rank
and @partition
.
在查询 C 中,我们遍历派生表的记录.我们正在做的或多或少是以下伪代码所做的事情:
In query C we loop through the derived table's records. What we're doing is more-or-less what the following pseudocode does:
rank = 0
partition = nil
foreach row in fetch_rows(sorted_query):
(date, id, value) = row
if partition is nil or partition == date:
rank += 1
else:
rank = 1
partition = date
stdout.write(date, id, value, rank, partition)
最后,查询 D 从 C 除了的所有列投影到保存@partition
的列(我们将其命名为dummy
并执行不需要显示).
Finally, query D projects all columns from C except for the column holding @partition
(which we named dummy
and do not need to display).
这篇关于组内的MySQL排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!