选择每列的最后一个值,每个用户,最简单的方法 [英] Select the last value of each column, per user, the simplest way

查看:49
本文介绍了选择每列的最后一个值,每个用户,最简单的方法的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是相关上一个问题的扩展版本.我已经发布了一个新问题 Erwin Brandstetter 建议我这样做.(在人们回答了我的第一个问题后,我意识到我实际上想要这个)

This is an extended version of a related previous question. I have posted it a new question for Erwin Brandstetter suggested me to do so. (I realized that I actually wanted this, after people replied to my first question)

有以下数据(空白表示NULL):

Having the following data (blank means NULL):

ID    User  ColA    ColB    ColC
1     1     15              20
2     1     11      4       
3     1             3
4     2     5       5       10
5     2     6 
6     2             8
7     1             1

如何以最简单的方式获取所有用户每列的最后一个非空值?所以给定数据的结果是:

How can I get the last not-NULL values of each column for all users, the simplest way? So the resulting for the given data would be:

User  ColA    ColB    ColC
1     11      1       20
2     6       8       10

我没有找到太多,似乎做类似于我描述的事情的函数是 COALESCE,但在我的情况下它没有按预期工作.

I have not found much, the function that seemed to do something similar to what I describe was COALESCE, but it does not work as expected in my case.

注意:尽可能使用标准 SQL,否则使用 PostgreSQL.涉及的列的数量可能会发生变化,因此最好采用与这三个特定列无关的解决方案.

Note: Standard SQL if possible, PostgreSQL otherwise. The count of the involved columns might change, so a solution that is not tied to these three specific columns would be best.

推荐答案

标准"SQL

类似于我在上一个问题中发布的内容,递归 CTE 很优雅,可能是在标准 SQL 中执行此操作的最快方法 - 特别是对于每个用户的多行.

"Standard" SQL

Similar to what I posted on the previous question, a recursive CTE is elegant and probably the fastest way to do it in standard SQL - especially for many rows per user.

WITH RECURSIVE t AS (
   SELECT row_number() OVER (PARTITION BY usr ORDER  BY id DESC) AS rn
         ,usr, cola, colb, colc
   FROM   tbl
   )

   , x AS (
   SELECT rn, usr, cola, colb, colc
   FROM   t
   WHERE  rn = 1

   UNION ALL
   SELECT t.rn, t.usr
        , COALESCE(x.cola, t.cola)
        , COALESCE(x.colb, t.colb)
        , COALESCE(x.colc, t.colc)
   FROM   x
   JOIN   t USING (usr)
   WHERE  t.rn = x.rn + 1
   AND    (x.cola IS NULL OR x.colb IS NULL OR x.colc IS NULL)
   )
SELECT DISTINCT ON (usr)
       usr, cola, colb, colc
FROM   x
ORDER  BY usr, rn DESC;

->用于请求的 PostgreSQL 的 sqlfiddle.

唯一的非标准元素是DISTINCT ON,它是标准中DISTINCT的扩展.将最后的 SELECT 替换为标准 SQL:

The only non-standard element is DISTINCT ON, which is an extension to DISTINCT in the standard. Replace the final SELECT with this for a standard SQL:

SELECT usr
      ,max(cola) As cola
      ,max(colb) As colb
      ,max(colc) As colc
FROM   x
GROUP  BY usr
ORDER  BY usr;

对标准 SQL"的请求用途有限.该标准仅存在于纸面上.没有 RDBMS 实现 100% 的标准 SQL - 这也毫无意义,因为标准在这里和那里都包含了无意义的部分.可以说,PostgreSQL 的实现是最接近标准的.

The request for "standard SQL" is of limited use. The standard only exists on paper. No RDBMS implements 100 % standard SQL - it would be rather pointless, too, since the standard includes nonsensical parts here and there. Arguably, PostgreSQL's implementation is among the closest to the standard.

此解决方案特定于 PostgreSQL,但性能应该非常好.

This solution is specific to PostgreSQL, but should perform very well.

我在上面的小提琴中演示的同一张桌子上建造.

I am building on the same table as demonstrated in the fiddle above.

CREATE OR REPLACE FUNCTION f_last_nonull_per_user()
RETURNS SETOF tbl AS
$func$
DECLARE
   _row tbl;  -- table name can be used as row type
   _new tbl;
BEGIN

FOR _new IN
   SELECT * FROM tbl ORDER BY usr, id DESC
LOOP
   IF _new.usr = _row.usr THEN 
      _row.id := _new.id;   -- copy only id
      IF _row.cola IS NULL AND _new.cola IS NOT NULL THEN
         _row.cola := _new.cola; END IF;   -- only if no value found yet
      IF _row.colb IS NULL AND _new.colb IS NOT NULL THEN
         _row.colb := _new.colb; END IF;
      IF _row.colc IS NULL AND _new.colc IS NOT NULL THEN
         _row.colc := _new.colc; END IF;
   ELSE
      IF _new.usr <> _row.usr THEN  -- doesn't fire on first row
         RETURN NEXT _row;
      END IF;   
      _row := _new;  -- remember row for next iteration
   END IF;
END LOOP;

RETURN NEXT _row;  -- return row for last usr

END
$func$ LANGUAGE plpgsql;

调用:

SELECT * FROM f_last_nonull_per_user();

返回整行 - 包括我们需要填充所有列的最小 id.

Returns the whole row - including the min id we need to fill all columns.

这篇关于选择每列的最后一个值,每个用户,最简单的方法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆