分组和计数 [英] Grouping and counting
问题描述
我有一个这样的数据集-
I have a data set like this --
**Team Date W/L**
Team_1 04/01/0012 W
Team_1 06/01/0012 W
Team_1 07/01/0012 L
Team_1 14/01/0012 W
Team_1 19/01/0012 W
Team_1 30/01/0012 L
Team_1 14/02/0012 W
Team_1 17/02/0012 L
Team_1 20/02/0012 W
Team_2 01/01/0012 W
Team_2 05/01/0012 W
Team_2 09/01/0012 W
Team_2 13/01/0012 L
Team_2 18/01/0012 W
Team_2 25/01/0012 L
Team_2 05/02/0012 L
Team_2 13/02/0012 L
Team_2 19/02/0012 L
Team_3 02/01/0012 W
Team_3 02/01/0012 W
Team_3 06/01/0012 W
Team_3 10/01/0012 W
Team_3 19/01/0012 W
Team_3 31/01/0012 L
Team_3 11/02/0012 W
Team_3 15/02/0012 L
Team_3 21/02/0012 W
因此,我需要找出谁拥有最大的连续胜利-
And from this I need to find out who had the biggest consecutive wins --
团队计数
Team_3 5
Team_2 3
Team_1 2
只允许我编写sql查询.我该怎么写?
I am allowed to write only sql queries. How can I write this?
推荐答案
您可以使用以下内容:
SELECT Team, TotalWins, FirstWin, LastWin
FROM ( SELECT Team,
WL,
COUNT(*) TotalWins,
MIN("Date") FirstWin,
MAX("Date") LastWin,
ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY COUNT(*) DESC) RowNumber
FROM ( SELECT Team,
"Date",
WL,
ROW_NUMBER() OVER(PARTITION BY Team ORDER BY "Date") - ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY "Date") Grouping
FROM T
) GroupedData
WHERE WL = 'W'
GROUP BY Team, WL, Grouping
) RankedData
WHERE RowNumber = 1;
它使用ROW_NUMBER对按团队划分的每个游戏进行排名,并根据结果对每组连续结果而言,两者之间的差异是唯一的.因此,对于您的第一支球队,您将:
It uses ROW_NUMBER to rank each game partitioned by team, and also by result, the difference between these two is unique for each group of consecutive results. So for your first team you would have:
Team Date W/L RN1 RN2 DIFF
Team_1 04/01/0012 W 1 1 0
Team_1 06/01/0012 W 2 2 0
Team_1 07/01/0012 L 3 1 2
Team_1 14/01/0012 W 4 3 1
Team_1 19/01/0012 W 5 4 1
Team_1 30/01/0012 L 6 2 4
Team_1 14/02/0012 W 7 5 2
Team_1 17/02/0012 L 8 3 5
Team_1 20/02/0012 W 9 6 3
其中RN1仅按组划分,而rn2按组和结果划分.
Where RN1 is just partitioned by team, and rn2 is partition by team and result.
如您所见,如果您删除损失,那么对于每组连续的胜利,DIFF列将增加1:
As you can see, if You remove the Losses then the DIFF column increments by one for each group of consecutive victories:
Team Date W/L RN1 RN2 DIFF
Team_1 04/01/0012 W 1 1 0
Team_1 06/01/0012 W 2 2 0
---------------------------------------
Team_1 14/01/0012 W 4 3 1
Team_1 19/01/0012 W 5 4 1
---------------------------------------
Team_1 14/02/0012 W 7 5 2
---------------------------------------
Team_1 20/02/0012 W 9 6 3
然后您可以按此分组以确保您正在寻找连续的获胜者,并进行计数以获得最大收益.然后,我只是使用另一个行号来获得每个团队的最大连续获胜次数.
You can then group by this to ensure you are looking at consecutive wins, and do a count to get the most. I've then just used another rownumber to get the maximum consecutive wins per team.
Example on SQL Fiddle
这篇关于分组和计数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!