Get top 1 row of each group


Question


I have a table from which I want to get the latest entry for each group. Here's the table:

DocumentStatusLogs Table

|ID| DocumentID | Status | DateCreated |
| 2| 1          | S1     | 7/29/2011   |
| 3| 1          | S2     | 7/30/2011   |
| 6| 1          | S1     | 8/02/2011   |
| 1| 2          | S1     | 7/28/2011   |
| 4| 2          | S2     | 7/30/2011   |
| 5| 2          | S3     | 8/01/2011   |
| 6| 3          | S1     | 8/02/2011   |

The table will be grouped by DocumentID and sorted by DateCreated in descending order. For each DocumentID, I want to get the latest status.

My preferred output:

| DocumentID | Status | DateCreated |
| 1          | S1     | 8/02/2011   |
| 2          | S3     | 8/01/2011   |
| 3          | S1     | 8/02/2011   |

  • Is there any aggregate function to get only the top from each group? See pseudo-code GetOnlyTheTop below:

    SELECT
      DocumentID,
      GetOnlyTheTop(Status),
      GetOnlyTheTop(DateCreated)
    FROM DocumentStatusLogs
    GROUP BY DocumentID
    ORDER BY DateCreated DESC
    

  • If such a function doesn't exist, is there any other way I can achieve the output I want?

  • Or, in the first place, could this be caused by an unnormalized database? Since what I'm looking for is just one row per document, should that status also be stored in the parent table?

Please see the parent table for more information:

Current Documents Table

| DocumentID | Title  | Content  | DateCreated |
| 1          | TitleA | ...      | ...         |
| 2          | TitleB | ...      | ...         |
| 3          | TitleC | ...      | ...         |

Should the parent table be like this so that I can easily access its status?

| DocumentID | Title  | Content  | DateCreated | CurrentStatus |
| 1          | TitleA | ...      | ...         | s1            |
| 2          | TitleB | ...      | ...         | s3            |
| 3          | TitleC | ...      | ...         | s1            |

UPDATE: I just learned how to use "apply", which makes it easier to address such problems.

Solution

;WITH cte AS
(
   SELECT *,
         ROW_NUMBER() OVER (PARTITION BY DocumentID ORDER BY DateCreated DESC) AS rn
   FROM DocumentStatusLogs
)
SELECT *
FROM cte
WHERE rn = 1
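
To try the query above without a SQL Server instance, here is a minimal sketch that runs the same CTE through Python's built-in `sqlite3` module. This is an assumption-laden stand-in, not the original environment: SQLite replaces SQL Server (window functions need SQLite 3.25 or later), the dates are rewritten as ISO strings so that they sort correctly as text, and the function name `latest_status_per_document` is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (ID, DocumentID, Status, DateCreated).
# Dates are ISO strings (an assumption) so text ordering matches date ordering.
ROWS = [
    (2, 1, "S1", "2011-07-29"),
    (3, 1, "S2", "2011-07-30"),
    (6, 1, "S1", "2011-08-02"),
    (1, 2, "S1", "2011-07-28"),
    (4, 2, "S2", "2011-07-30"),
    (5, 2, "S3", "2011-08-01"),
    (6, 3, "S1", "2011-08-02"),
]


def latest_status_per_document(rows):
    """Return the newest (DocumentID, Status, DateCreated) row per document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DocumentStatusLogs "
        "(ID INTEGER, DocumentID INTEGER, Status TEXT, DateCreated TEXT)"
    )
    conn.executemany("INSERT INTO DocumentStatusLogs VALUES (?, ?, ?, ?)", rows)
    # Number the rows inside each DocumentID from newest to oldest,
    # then keep only the first row of every group.
    result = conn.execute(
        """
        WITH cte AS (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY DocumentID
                       ORDER BY DateCreated DESC
                   ) AS rn
            FROM DocumentStatusLogs
        )
        SELECT DocumentID, Status, DateCreated
        FROM cte
        WHERE rn = 1
        ORDER BY DocumentID
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_status_per_document(ROWS):
        print(row)
```

On the sample data this yields one row per DocumentID, matching the preferred output in the question.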

If you expect two entries per day for a document (that is, ties on DateCreated), ROW_NUMBER will arbitrarily pick one of them. To get both entries for that day, use DENSE_RANK instead.
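
The difference on ties can be sketched as follows. This is only an illustration under assumptions: it uses Python's built-in `sqlite3` (SQLite 3.25 or later) instead of SQL Server, the three sample rows are invented so that document 1 has two entries on the same day, and `top_rows` is a hypothetical helper name.

```python
import sqlite3

# Invented sample: document 1 has two entries with the same DateCreated.
TIED_ROWS = [
    (1, 1, "S1", "2011-08-02"),
    (2, 1, "S2", "2011-08-02"),
    (3, 2, "S3", "2011-08-01"),
]


def top_rows(ranking_function):
    """Return (ID, DocumentID, Status) rows ranked first within each document."""
    # Only these two function names are interpolated into the SQL text.
    if ranking_function not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError("expected ROW_NUMBER or DENSE_RANK")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE DocumentStatusLogs "
        "(ID INTEGER, DocumentID INTEGER, Status TEXT, DateCreated TEXT)"
    )
    conn.executemany("INSERT INTO DocumentStatusLogs VALUES (?, ?, ?, ?)", TIED_ROWS)
    query = f"""
        WITH cte AS (
            SELECT *,
                   {ranking_function}() OVER (
                       PARTITION BY DocumentID
                       ORDER BY DateCreated DESC
                   ) AS rn
            FROM DocumentStatusLogs
        )
        SELECT ID, DocumentID, Status
        FROM cte
        WHERE rn = 1
        ORDER BY DocumentID, ID
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # ROW_NUMBER keeps exactly one of the tied rows (which one is not defined).
    print(top_rows("ROW_NUMBER"))
    # DENSE_RANK keeps every row that shares the newest date.
    print(top_rows("DENSE_RANK"))
```

With ROW_NUMBER, document 1 contributes a single row whose identity is not guaranteed; adding a tie-breaker such as `ID DESC` to the ORDER BY would make the choice deterministic.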

As for whether to normalise or not, it depends on whether you want to:

  • maintain status in 2 places
  • preserve status history
  • ...

As it stands, you preserve status history. If you want the latest status in the parent table too (which is denormalisation), you'd need a trigger to maintain "status" in the parent, or you could drop this status history table.
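
The trigger option could look roughly like the sketch below. It is written against SQLite through Python's built-in `sqlite3`, so the trigger syntax is SQLite's rather than SQL Server's (a T-SQL trigger would read from the `inserted` pseudo-table instead of `NEW`). The trigger name `trg_sync_current_status`, the simplified `Documents` columns, and the helper functions are assumptions made for illustration.

```python
import sqlite3


def build_demo():
    """Create parent and history tables plus a trigger that syncs CurrentStatus."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Documents (
            DocumentID INTEGER PRIMARY KEY,
            Title TEXT,
            CurrentStatus TEXT
        );
        CREATE TABLE DocumentStatusLogs (
            ID INTEGER PRIMARY KEY,
            DocumentID INTEGER,
            Status TEXT,
            DateCreated TEXT
        );
        -- After each new log row, copy the newest status into the parent row.
        CREATE TRIGGER trg_sync_current_status
        AFTER INSERT ON DocumentStatusLogs
        BEGIN
            UPDATE Documents
            SET CurrentStatus = (
                SELECT Status
                FROM DocumentStatusLogs
                WHERE DocumentID = NEW.DocumentID
                ORDER BY DateCreated DESC, ID DESC
                LIMIT 1
            )
            WHERE DocumentID = NEW.DocumentID;
        END;
        """
    )
    return conn


def record_status(conn, document_id, status, date_created):
    """Append a history row; the trigger updates Documents.CurrentStatus."""
    conn.execute(
        "INSERT INTO DocumentStatusLogs (DocumentID, Status, DateCreated) "
        "VALUES (?, ?, ?)",
        (document_id, status, date_created),
    )


def current_status(conn, document_id):
    """Read the denormalised status straight from the parent table."""
    row = conn.execute(
        "SELECT CurrentStatus FROM Documents WHERE DocumentID = ?",
        (document_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo()
    conn.execute("INSERT INTO Documents (DocumentID, Title) VALUES (2, 'TitleB')")
    record_status(conn, 2, "S1", "2011-07-28")
    record_status(conn, 2, "S3", "2011-08-01")
    record_status(conn, 2, "S2", "2011-07-30")  # older entry arriving late
    print(current_status(conn, 2))
```

The trade-off is the one described above: reads of the current status become a plain column lookup, at the cost of keeping the status in two places and relying on the trigger to keep them consistent.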
