Query table with array_agg/median of ALL previous positions, LAST_10, LAST_50, excluding current position
Question
This is a follow-up to my previous post.
I have a database table with:
id | date | position | name
--------------------------------------
1 | 2016-06-29 | 9 | Ben Smith
2 | 2016-06-29 | 1 | Ben Smith
3 | 2016-06-29 | 5 | Ben Smith
4 | 2016-06-29 | 6 | Ben Smith
5 | 2016-06-30 | 2 | Ben Smith
6 | 2016-06-30 | 2 | Tom Brown
7 | 2016-06-29 | 4 | Tom Brown
8 | 2016-06-30 | 2 | Tom Brown
9 | 2016-06-30 | 1 | Tom Brown
How can I query the table efficiently so that I can get new columns using array_agg()?
I have already tried the following query; however, it is incredibly slow and also wrong, as it doesn't group the previous_positions by the name column:
SELECT runners.id AS runner_id,
btrim(regexp_replace(replace(upper(runners.name::text), '.'::text, ''::text), '[[:digit:]]'::text, ''::text, 'g'::text)) AS name,
runners.position_two,
array_agg(runners.position_two) OVER w AS results
FROM runners
WINDOW w AS (PARTITION BY (btrim(regexp_replace(replace(upper(runners.name::text), '.'::text, ''::text), '[[:digit:]]'::text, ''::text, 'g'::text))) ORDER BY runners.id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING);
I expect the table output to look like this:
id | date | position | name | previous | med | med_20
----------------------------------------------------------------------
1 | 2016-06-29 | 9 | Ben Smith | {} | None | None
2 | 2016-06-29 | 1 | Ben Smith | {9} | 9 | 9
3 | 2016-06-29 | 5 | Ben Smith | {9,1} | 5 | 5
4 | 2016-06-29 | 6 | Ben Smith | {9,1,5} | 5 | 5
5 | 2016-06-30 | 2 | Ben Smith | {9,1,5,6} | 5.5 | 5.5
6 | 2016-06-30 | 2 | Tom Brown | {} | None | None
7 | 2016-06-29 | 4 | Tom Brown | {2} | 2 | 2
8 | 2016-06-30 | 2 | Tom Brown | {2,4} | 3 | 3
9 | 2016-06-30 | 1 | Tom Brown | {2,4,2} | 2 | 2
Recommended answer
Postgres doesn't have a built-in aggregate function named MEDIAN. However, you can create one using the function snippet available in the Postgres wiki. This snippet is also part of the ulib_agg user-defined library.
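For readers who want to see what such a user-defined aggregate can look like, here is a minimal sketch modeled on the wiki snippet. Treat the helper name `_final_median` and the choice of NUMERIC as assumptions, and check the wiki page for the current version before relying on it. The built-in `percentile_cont(0.5) WITHIN GROUP (...)` also yields a median in grouped queries, but as an ordered-set aggregate it cannot be combined with OVER, which is why a custom aggregate is needed for the windowed case.

```sql
-- Final function: takes the collected values and returns their median.
-- Sorts the array, then averages the middle one (odd count) or two (even count) values.
CREATE OR REPLACE FUNCTION _final_median(NUMERIC[])
   RETURNS NUMERIC AS
$$
   SELECT AVG(val)
   FROM (
     SELECT val
     FROM unnest($1) val
     ORDER BY 1
     LIMIT  2 - MOD(array_upper($1, 1), 2)
     OFFSET CEIL(array_upper($1, 1) / 2.0) - 1
   ) sub;
$$
LANGUAGE sql IMMUTABLE;

-- Aggregate: accumulates the input values into an array,
-- then hands the array to the final function.
CREATE AGGREGATE median(NUMERIC) (
  SFUNC = array_append,
  STYPE = NUMERIC[],
  FINALFUNC = _final_median,
  INITCOND = '{}'
);
```

With this definition an integer column such as position is implicitly cast to NUMERIC, and an empty window frame (the first row of each name) yields NULL.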
Once it is created, you can use it like any other aggregate function, such as SUM or STRING_AGG, with a similar window specification. Postgres also lets you define multiple named windows in a single WINDOW clause, separated by commas.
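As an illustration that needs no custom aggregate, the sketch below applies the built-in SUM and STRING_AGG over two named windows declared in one WINDOW clause. The window names `w_all` and `w_3` and the output column names are made up for this example; the table and columns follow the question's sample data.

```sql
SELECT j.*,
       -- total of all earlier positions for the same name
       sum(position) OVER w_all AS sum_previous,
       -- the (up to) three most recent earlier positions as text
       string_agg(position::text, ',') OVER w_3 AS last_3
FROM jockeys j
WINDOW w_all AS (PARTITION BY name ORDER BY id
                 ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
       w_3   AS (PARTITION BY name ORDER BY id
                 ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING);
```

Both frames end at 1 PRECEDING, so the current row is never included in its own aggregate.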
So, to get the MEDIAN of the previous 20 records, your windows could be defined as in this query:
SELECT
    j.*,
    array_agg(position) OVER w AS previous_positions,
    median(position) OVER w_20 AS med_20
FROM jockeys j
WINDOW
    w AS (
        PARTITION BY name ORDER BY id
        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ),
    w_20 AS (
        PARTITION BY name ORDER BY id
        ROWS BETWEEN 20 PRECEDING AND 1 PRECEDING
    );
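The same pattern extends to the other columns the question asks for: a median over all previous rows plus medians over the last 10 and last 50. The sketch below assumes a `median` aggregate has already been created; the window names (`w_base`, `w_all`, `w_10`, `w_50`) are illustrative. It relies on the fact that a window definition may start from an existing window name and add only its own frame clause, provided the referenced window has no frame of its own.

```sql
SELECT j.*,
       -- array_agg returns NULL for an empty frame; COALESCE turns that into {}
       COALESCE(array_agg(position) OVER w_all, '{}') AS previous,
       median(position) OVER w_all AS med,
       median(position) OVER w_10  AS med_10,
       median(position) OVER w_50  AS med_50
FROM jockeys j
WINDOW w_base AS (PARTITION BY name ORDER BY id),
       w_all  AS (w_base ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
       w_10   AS (w_base ROWS BETWEEN 10 PRECEDING AND 1 PRECEDING),
       w_50   AS (w_base ROWS BETWEEN 50 PRECEDING AND 1 PRECEDING)
ORDER BY j.id;
```

Sharing `w_base` keeps the partitioning and ordering in one place, so changing the grouping key later means editing a single line.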
On top of that, you can apply the ROUND function if you want to limit the number of decimal digits. Note that ROUND rounds the value; use TRUNC if you specifically want truncation.
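A short sketch of that step, assuming `median` returns NUMERIC (the two-argument form of ROUND is defined for NUMERIC, not for double precision):

```sql
SELECT j.*,
       -- keep one decimal place, e.g. 5.5 stays 5.5 and 3.25 becomes 3.3
       round(median(position) OVER w_20, 1) AS med_20
FROM jockeys j
WINDOW w_20 AS (PARTITION BY name ORDER BY id
                ROWS BETWEEN 20 PRECEDING AND 1 PRECEDING);
```

If the aggregate you installed returns double precision instead, cast the result to NUMERIC before rounding.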