找到每位玩家最长的完美成绩连胜纪录 [英] Find the longest streak of perfect scores per player
问题描述
我在PostgreSQL数据库中使用 ORDER BY player_id ASC,时间ASC
的 SELECT
查询得到以下结果:
player_id点时间395 0 2018-06-01 17:55:23.982413-04395100 2018-06-30 11:05:21.8679-04395 0 2018-07-15 21:56:25.420837-04395100 2018-07-28 19:47:13.84652-04395 0 2018-11-27 17:09:59.384-05395100 2018-12-02 08:56:06.83033-05399 0 2018-05-15 15:28:22.782945-04399100 2018-06-10 12:11:18.041521-04454 0 2018-07-10 18:53:24.236363-04675 0 2018-08-07 20:59:15.510936-04696 0 2018-08-07 19:09:07.126876-04756100 2018-08-15 08:21:11.300871-04756 100 2018-08-15 16:43:08.698862-04756 0 2018-08-15 17:22:49.755721-04756100 2018-10-07 15:30:49.27374-04756 0 2018-10-07 15:35:00.975252-04756 0 2018-11-27 19:04:06.456982-05756 100 2018-12-02 19:24:20.880022-05756100 2018-12-04 19:57:48.961111-05
我试图找到每个玩家最长的连胜纪录,其中 points = 100
,并且决胜局以最近一次连胜纪录为准.我还需要确定该球员最长连胜的开始时间.预期结果将是:
player_id longest_streak time_began395 1 2018-12-02 08:56:06.83033-05399 1 2018-06-10 12:11:18.041521-04756 2 2018-12-02 19:24:20.880022-05
一个差距和岛屿的问题.
假设:
- 条纹"不会被其他玩家的行打断.
- 所有列均定义为
NOT NULL
.(否则您需要做更多.)
这应该最简单,最快,因为它只需要两个快速 row_number()
窗口函数:
SELECT DISTINCT ON(player_id)player_id,count(*)AS seq_len,min(ts)AS time_began从 (SELECT player_id,积分,ts,row_number()OVER(PARTITION BY player_id ORDER BY ts)-row_number()OVER(PARTITION BY player_id,点ORDER BY ts)AS grp来自tbl)子WHERE点= 100GROUP BY player_id,grp-在WHERE点= 100之后省略点"ORDER BY player_id,seq_len DESC,time_begin DESC;
db<>小提琴此处 >
使用列名 ts
代替 time
,这是(player_id,点)的同一组( grp
)中.然后用100点过滤那些,按组汇总,并且仅返回每个玩家最长,最新的结果.
该技术的基本说明:
我们可以在同一 SELECT
中使用 GROUP BY
和 DISTINCT ON
,并应用 GROUP BY >之前 DISTINCT ON
.考虑 SELECT
查询中的事件顺序:
关于 DISTINCT ON
:
I have a the following result from a SELECT
query with ORDER BY player_id ASC, time ASC
in PostgreSQL database:
player_id points time
395 0 2018-06-01 17:55:23.982413-04
395 100 2018-06-30 11:05:21.8679-04
395 0 2018-07-15 21:56:25.420837-04
395 100 2018-07-28 19:47:13.84652-04
395 0 2018-11-27 17:09:59.384-05
395 100 2018-12-02 08:56:06.83033-05
399 0 2018-05-15 15:28:22.782945-04
399 100 2018-06-10 12:11:18.041521-04
454 0 2018-07-10 18:53:24.236363-04
675 0 2018-08-07 20:59:15.510936-04
696 0 2018-08-07 19:09:07.126876-04
756 100 2018-08-15 08:21:11.300871-04
756 100 2018-08-15 16:43:08.698862-04
756 0 2018-08-15 17:22:49.755721-04
756 100 2018-10-07 15:30:49.27374-04
756 0 2018-10-07 15:35:00.975252-04
756 0 2018-11-27 19:04:06.456982-05
756 100 2018-12-02 19:24:20.880022-05
756 100 2018-12-04 19:57:48.961111-05
I'm trying to find each player's longest streak where points = 100
, with the tiebreaker being whichever streak began most recently. I also need to determine the time at which that player's longest streak began. The expected result would be:
player_id longest_streak time_began
395 1 2018-12-02 08:56:06.83033-05
399 1 2018-06-10 12:11:18.041521-04
756 2 2018-12-02 19:24:20.880022-05
A gaps-and-islands problem indeed.
Assuming:
- "Streaks" are not interrupted by rows from other players.
- All columns are defined
NOT NULL
. (Else you have to do more.)
This should be simplest and fastest as it only needs two fast row_number()
window functions:
SELECT DISTINCT ON (player_id)
player_id, count(*) AS seq_len, min(ts) AS time_began
FROM (
SELECT player_id, points, ts
, row_number() OVER (PARTITION BY player_id ORDER BY ts)
- row_number() OVER (PARTITION BY player_id, points ORDER BY ts) AS grp
FROM tbl
) sub
WHERE points = 100
GROUP BY player_id, grp -- omit "points" after WHERE points = 100
ORDER BY player_id, seq_len DESC, time_began DESC;
db<>fiddle here
Using the column name ts
instead of time
, which is a reserved word in standard SQL. It's allowed in Postgres, but with limitations and it's still a bad idea to use it as identifier.
The "trick" is to subtract row numbers so that consecutive rows fall in the same group (grp
) per (player_id, points)
. Then filter the ones with 100 points, aggregate per group and return only the longest, most recent result per player.
Basic explanation for the technique:
We can use GROUP BY
and DISTINCT ON
in the same SELECT
, GROUP BY
is applied before DISTINCT ON
. Consider the sequence of events in a SELECT
query:
About DISTINCT ON
:
这篇关于找到每位玩家最长的完美成绩连胜纪录的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!