组内的MySQL排序 [英] within group sorts in mysql

查看:136
本文介绍了组内的MySQL排序的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个面板数据集:即timesidsvalues.我想根据每个日期的值进行排名.我可以通过运行以下命令非常简单地实现排序:

I have a panel data set: that is, times, ids, and values. I would like to do a ranking based on value for each date. I can achieve the sort very simply by running:

select * from tbl order by date, value

我遇到的问题是,以这种方式对表格进行排序后,如何检索每个组的行号(也就是说,对于每个日期,我希望有一个列从1到N的排名).

The issue I have is once the table is sorted in this way, how do I retrieve the row number of each group (that is, for each date I would like there to be a column called ranking that goes from 1 to N).

示例:

输入:

Date, ID, Value
 d1, id1, 2
 d1, id2, 1
 d2, id1, 10
 d2, id2, 11

输出:

Date, ID, Value, Rank
 d1, id2, 1, 1
 d1, id1, 2, 2
 d2, id1, 10, 1
 d2, id2, 11, 2

推荐答案

缺少窗口函数,您可以订购tbl并使用用户变量自己计算分区(日期"值)的等级:

Absent window functions, you can order tbl and use user variables to compute rank over your partitions ("date" values) yourself:

SELECT "date",                                                -- D) Desired columns
       id,
       value,
       rank
  FROM (SELECT "date",                                        -- C) Rank by date
               id,
               value,
               CASE COALESCE(@partition, "date")
                 WHEN "date" THEN @rank := @rank + 1
                 ELSE             @rank := 1
               END AS rank,
               @partition := "date" AS dummy
          FROM (SELECT @rank := 0 AS rank,                    -- A) User var init
                       @partition := NULL AS partition) dummy
               STRAIGHT_JOIN
               (  SELECT "date",                              -- B) Ordering query
                         id,
                         value
                    FROM tbl
                ORDER BY date, value) tbl_ordered;

更新

那么,该查询在做什么?

Update

So, what is that query doing?

我们正在使用用户变量来循环"经过排序的结果集,根据结果集的哪个连续段(在@partition中跟踪)来递增或重置计数器(@rank) )我们进入了.

We are using user variables to "loop" through a sorted result set, incrementing or resetting a counter (@rank) depending upon which contiguous segment of the result set (tracked in @partition) we're in.

在查询 A 中,我们初始化了两个用户变量.在查询 B 中,我们以所需的顺序获取表的记录:首先按日期,然后按值. A B 共同组成一个派生表tbl_ordered,看起来像这样:

In query A we initialize two user variables. In query B we get the records of your table in the order we need: first by date and then by value. A and B together make a derived table, tbl_ordered, that looks something like this:

rank | partition | "date" |  id  | value 
---- + --------- + ------ + ---- + -----
  0  |   NULL    |   d1   |  id2 |    1
  0  |   NULL    |   d1   |  id1 |    2
  0  |   NULL    |   d2   |  id1 |   10
  0  |   NULL    |   d2   |  id2 |   11

请记住,我们并不真正在意列dummy.rankdummy.partition-它们只是我们初始化变量@rank@partition的偶然原因.

Remember, we don't really care about the columns dummy.rank and dummy.partition — they're just accidents of how we initialize the variables @rank and @partition.

在查询 C 中,我们遍历派生表的记录.我们正在做的或多或少是以下伪代码所做的事情:

In query C we loop through the derived table's records. What we're doing is more-or-less what the following pseudocode does:

rank      = 0
partition = nil

foreach row in fetch_rows(sorted_query):
  (date, id, value) = row

  if partition is nil or partition == date:
    rank += 1
  else:
    rank = 1

  partition = date

  stdout.write(date, id, value, rank, partition)

最后,查询 D C 除了的所有列投影到保存@partition的列(我们将其命名为dummy并执行不需要显示).

Finally, query D projects all columns from C except for the column holding @partition (which we named dummy and do not need to display).

这篇关于组内的MySQL排序的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆