分组和计数 [英] Grouping and counting

查看:104
本文介绍了分组和计数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个这样的数据集-

I have a data set like this --

**Team  Date    W/L**
Team_1  04/01/0012  W
Team_1  06/01/0012  W
Team_1  07/01/0012  L
Team_1  14/01/0012  W
Team_1  19/01/0012  W
Team_1  30/01/0012  L
Team_1  14/02/0012  W
Team_1  17/02/0012  L
Team_1  20/02/0012  W
Team_2  01/01/0012  W
Team_2  05/01/0012  W
Team_2  09/01/0012  W
Team_2  13/01/0012  L
Team_2  18/01/0012  W
Team_2  25/01/0012  L
Team_2  05/02/0012  L
Team_2  13/02/0012  L
Team_2  19/02/0012  L
Team_3  02/01/0012  W
Team_3  02/01/0012  W
Team_3  06/01/0012  W
Team_3  10/01/0012  W
Team_3  19/01/0012  W
Team_3  31/01/0012  L
Team_3  11/02/0012  W
Team_3  15/02/0012  L
Team_3  21/02/0012  W

因此,我需要找出谁拥有最大的连续胜利-

And from this I need to find out who had the biggest consecutive wins --

团队计数

Team_3  5
Team_2  3
Team_1  2

只允许我编写sql查询.我该怎么写?

I am allowed to write only sql queries. How can I write this?

推荐答案

您可以使用以下内容:

SELECT  Team, TotalWins, FirstWin, LastWin
FROM    (   SELECT  Team, 
                    WL,
                    COUNT(*) TotalWins,
                    MIN("Date") FirstWin,
                    MAX("Date") LastWin,
                    ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY COUNT(*) DESC) RowNumber
            FROM    (   SELECT  Team,
                                "Date",
                                WL, 
                                ROW_NUMBER() OVER(PARTITION BY Team ORDER BY "Date") - ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY "Date") Grouping
                        FROM    T
                    ) GroupedData
            WHERE   WL = 'W'
            GROUP BY Team, WL, Grouping
        ) RankedData
WHERE   RowNumber = 1;

它使用ROW_NUMBER对按团队划分的每个游戏进行排名,并根据结果对每组连续结果而言,两者之间的差异是唯一的.因此,对于您的第一支球队,您将:

It uses ROW_NUMBER to rank each game partitioned by team, and also by result, the difference between these two is unique for each group of consecutive results. So for your first team you would have:

Team    Date        W/L RN1     RN2 DIFF
Team_1  04/01/0012  W   1       1   0
Team_1  06/01/0012  W   2       2   0
Team_1  07/01/0012  L   3       1   2
Team_1  14/01/0012  W   4       3   1
Team_1  19/01/0012  W   5       4   1
Team_1  30/01/0012  L   6       2   4
Team_1  14/02/0012  W   7       5   2
Team_1  17/02/0012  L   8       3   5
Team_1  20/02/0012  W   9       6   3

其中RN1仅按组划分,而rn2按组和结果划分.

Where RN1 is just partitioned by team, and rn2 is partition by team and result.

如您所见,如果您删除损失,那么对于每组连续的胜利,DIFF列将增加1:

As you can see, if You remove the Losses then the DIFF column increments by one for each group of consecutive victories:

Team    Date        W/L RN1     RN2 DIFF
Team_1  04/01/0012  W   1       1   0
Team_1  06/01/0012  W   2       2   0
---------------------------------------
Team_1  14/01/0012  W   4       3   1
Team_1  19/01/0012  W   5       4   1
---------------------------------------
Team_1  14/02/0012  W   7       5   2
---------------------------------------
Team_1  20/02/0012  W   9       6   3

然后您可以按此分组以确保您正在寻找连续的获胜者,并进行计数以获得最大收益.然后,我只是使用另一个行号来获得每个团队的最大连续获胜次数.

You can then group by this to ensure you are looking at consecutive wins, and do a count to get the most. I've then just used another rownumber to get the maximum consecutive wins per team.

关于SQL Fiddle的示例

Example on SQL Fiddle

这篇关于分组和计数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆