选择每列的最后一个值,每个用户,最简单的方法 [英] Select the last value of each column, per user, the simplest way
问题描述
这是相关上一个问题的扩展版本.我已经发布了一个新问题 Erwin Brandstetter 建议我这样做.(在人们回答了我的第一个问题后,我意识到我实际上想要这个)
This is an extended version of a related previous question. I have posted it a new question for Erwin Brandstetter suggested me to do so. (I realized that I actually wanted this, after people replied to my first question)
有以下数据(空白表示NULL):
Having the following data (blank means NULL):
ID User ColA ColB ColC
1 1 15 20
2 1 11 4
3 1 3
4 2 5 5 10
5 2 6
6 2 8
7 1 1
如何以最简单的方式获取所有用户每列的最后一个非空值?所以给定数据的结果是:
How can I get the last not-NULL values of each column for all users, the simplest way? So the resulting for the given data would be:
User ColA ColB ColC
1 11 1 20
2 6 8 10
我没有找到太多,似乎做类似于我描述的事情的函数是 COALESCE
,但在我的情况下它没有按预期工作.
I have not found much, the function that seemed to do something similar to what I describe was COALESCE
, but it does not work as expected in my case.
注意:尽可能使用标准 SQL,否则使用 PostgreSQL.涉及的列的数量可能会发生变化,因此最好采用与这三个特定列无关的解决方案.
Note: Standard SQL if possible, PostgreSQL otherwise. The count of the involved columns might change, so a solution that is not tied to these three specific columns would be best.
推荐答案
标准"SQL
类似于我在上一个问题中发布的内容,递归 CTE 很优雅,可能是在标准 SQL 中执行此操作的最快方法 - 特别是对于每个用户的多行.
"Standard" SQL
Similar to what I posted on the previous question, a recursive CTE is elegant and probably the fastest way to do it in standard SQL - especially for many rows per user.
WITH RECURSIVE t AS (
SELECT row_number() OVER (PARTITION BY usr ORDER BY id DESC) AS rn
,usr, cola, colb, colc
FROM tbl
)
, x AS (
SELECT rn, usr, cola, colb, colc
FROM t
WHERE rn = 1
UNION ALL
SELECT t.rn, t.usr
, COALESCE(x.cola, t.cola)
, COALESCE(x.colb, t.colb)
, COALESCE(x.colc, t.colc)
FROM x
JOIN t USING (usr)
WHERE t.rn = x.rn + 1
AND (x.cola IS NULL OR x.colb IS NULL OR x.colc IS NULL)
)
SELECT DISTINCT ON (usr)
usr, cola, colb, colc
FROM x
ORDER BY usr, rn DESC;
->用于请求的 PostgreSQL 的 sqlfiddle.
唯一的非标准元素是DISTINCT ON
,它是标准中DISTINCT
的扩展.将最后的 SELECT
替换为标准 SQL:
The only non-standard element is DISTINCT ON
, which is an extension to DISTINCT
in the standard. Replace the final SELECT
with this for a standard SQL:
SELECT usr
,max(cola) As cola
,max(colb) As colb
,max(colc) As colc
FROM x
GROUP BY usr
ORDER BY usr;
对标准 SQL"的请求用途有限.该标准仅存在于纸面上.没有 RDBMS 实现 100% 的标准 SQL - 这也毫无意义,因为标准在这里和那里都包含了无意义的部分.可以说,PostgreSQL 的实现是最接近标准的.
The request for "standard SQL" is of limited use. The standard only exists on paper. No RDBMS implements 100 % standard SQL - it would be rather pointless, too, since the standard includes nonsensical parts here and there. Arguably, PostgreSQL's implementation is among the closest to the standard.
此解决方案特定于 PostgreSQL,但性能应该非常好.
This solution is specific to PostgreSQL, but should perform very well.
我在上面的小提琴中演示的同一张桌子上建造.
I am building on the same table as demonstrated in the fiddle above.
CREATE OR REPLACE FUNCTION f_last_nonull_per_user()
RETURNS SETOF tbl AS
$func$
DECLARE
_row tbl; -- table name can be used as row type
_new tbl;
BEGIN
FOR _new IN
SELECT * FROM tbl ORDER BY usr, id DESC
LOOP
IF _new.usr = _row.usr THEN
_row.id := _new.id; -- copy only id
IF _row.cola IS NULL AND _new.cola IS NOT NULL THEN
_row.cola := _new.cola; END IF; -- only if no value found yet
IF _row.colb IS NULL AND _new.colb IS NOT NULL THEN
_row.colb := _new.colb; END IF;
IF _row.colc IS NULL AND _new.colc IS NOT NULL THEN
_row.colc := _new.colc; END IF;
ELSE
IF _new.usr <> _row.usr THEN -- doesn't fire on first row
RETURN NEXT _row;
END IF;
_row := _new; -- remember row for next iteration
END IF;
END LOOP;
RETURN NEXT _row; -- return row for last usr
END
$func$ LANGUAGE plpgsql;
调用:
SELECT * FROM f_last_nonull_per_user();
返回整行 - 包括我们需要填充所有列的最小 id
.
Returns the whole row - including the min id
we need to fill all columns.
这篇关于选择每列的最后一个值,每个用户,最简单的方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!