How to perform the same aggregation on every column, without listing the columns?


Question

I have a table with N columns. Let's call them c1, c2, c3, c4, ... cN. Across multiple rows, I want to get a single row with COUNT(DISTINCT cX) for each X in [1, N].

c1 | c2 | ... | cn
0  | 4  | ... | 1

Is there a way I can do this (in a stored procedure) without writing every column name into the query manually?

We've had a problem where bugs in application servers mean good column values get overwritten with garbage inserted later. To solve this, I'm storing the information in a log structure, where each row represents a logical UPDATE query. Then, when given a signal that the record is complete, I can determine whether any values were (erroneously) overwritten.

An example of a single correct record spread over multiple rows: there is at most one distinct non-NULL value for each column.

| id | initialize_time | start_time | end_time |
| 1  | 12:00am         | NULL       | NULL     |
| 1  | 12:00am         | 1:00pm     | NULL     |
| 1  | 12:00am         | NULL       | 2:00pm   |

Reconciled row:
| 1  | 12:00am         | 1:00pm     | 2:00pm   |

An example of an irreconcilable record that I want to detect:

| id | initialize_time | start_time | end_time |
| 1  | 12:00am         | NULL       | NULL     |
| 1  | 12:00am         | 1:00pm     | NULL     |
| 1  | 9:00am          | 1:00pm     | 2:00pm   |   -- New initialize time => irreconcilable!


Recommended answer

You need dynamic SQL for that, which means you have to create a function or run a DO command. Since you cannot return values directly from the latter, a plpgsql function it is:

CREATE OR REPLACE FUNCTION f_count_all(_tbl text
                           , OUT columns text[], OUT counts bigint[])
  RETURNS record LANGUAGE plpgsql AS
$func$
BEGIN

-- Build one SELECT from the system catalog that returns two parallel arrays
-- (column names and count(col) per column), then run it against _tbl.
EXECUTE (
    SELECT 'SELECT
     ARRAY[' || string_agg('''' || quote_ident(attname) || '''', ', ') || '],
     ARRAY[' || string_agg('count(' || quote_ident(attname) || ')', ', ') || ']
    FROM ' || _tbl
    FROM   pg_attribute
    WHERE  attrelid = _tbl::regclass  -- fails unless _tbl names an existing table
    AND    attnum  >= 1           -- exclude tableoid & friends (neg. attnum)
    AND    attisdropped is FALSE  -- exclude deleted columns
    GROUP  BY attrelid
    )
INTO columns, counts;

END
$func$;

Call:

SELECT * FROM f_count_all('myschema.mytable');

Returns:

columns       | counts
--------------+--------
{c1,c2,c3}    | {17,1,0}
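
Note that f_count_all() counts non-NULL values per column, while the question asks for COUNT(DISTINCT ...). A minimal sketch of a variant that counts distinct values could look like the following. The name f_count_distinct_all is hypothetical, and the sketch assumes a PostgreSQL version that provides format() (9.1 or later) and column types that have an equality operator (count(DISTINCT ...) raises an error for types such as json).

```sql
-- Hypothetical variant of f_count_all(): one count(DISTINCT col) per column.
-- Taking the table as regclass validates the name when the function is called.
CREATE OR REPLACE FUNCTION f_count_distinct_all(_tbl regclass
                           , OUT columns text[], OUT counts bigint[])
  RETURNS record LANGUAGE plpgsql AS
$func$
BEGIN

EXECUTE (
    SELECT format('SELECT ARRAY[%s]::text[], ARRAY[%s]::bigint[] FROM %s'
                , string_agg(quote_literal(attname), ', ' ORDER BY attnum)
                , string_agg(format('count(DISTINCT %I)', attname), ', ' ORDER BY attnum)
                , _tbl)
    FROM   pg_attribute
    WHERE  attrelid = _tbl
    AND    attnum >= 1        -- exclude system columns
    AND    NOT attisdropped   -- exclude dropped columns
    )
INTO columns, counts;

END
$func$;

-- Same call pattern as before:
SELECT * FROM f_count_distinct_all('myschema.mytable');

-- One row per column instead of two arrays (both arrays have the same length):
SELECT unnest(columns) AS col, unnest(counts) AS ct
FROM   f_count_distinct_all('myschema.mytable');
```

count(DISTINCT ...) ignores NULL values, which matches the log structure in the question, where NULL means "not set in this row".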

More explanation and links about dynamic SQL and EXECUTE can be found in this related question, or in a couple more here on SO; try this search.

Very similar to this question:

postgresql - count (no null values)

You could even try to return a polymorphic record type to get single columns dynamically, but that's rather complex and advanced. Probably too much effort for your case. More in this related answer.
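
Applied to the log table from the question, the same catalog-driven approach can be sketched per record: group by the key column and report every key for which some other column holds more than one distinct non-NULL value. This is an illustration under assumptions, not part of the original answer: the function name f_irreconcilable_ids, the _key parameter and the table name myschema.mylog are hypothetical, and every non-key column is assumed to have an equality operator.

```sql
-- Hypothetical helper: returns the keys of records that cannot be reconciled,
-- i.e. some non-key column has more than one distinct non-NULL value.
CREATE OR REPLACE FUNCTION f_irreconcilable_ids(_tbl regclass, _key text)
  RETURNS SETOF text LANGUAGE plpgsql AS
$func$
DECLARE
   _cond text;
BEGIN
   -- Build "count(DISTINCT c1) > 1 OR count(DISTINCT c2) > 1 OR ..."
   SELECT string_agg(format('count(DISTINCT %I) > 1', attname), ' OR ')
   INTO   _cond
   FROM   pg_attribute
   WHERE  attrelid = _tbl
   AND    attnum >= 1        -- exclude system columns
   AND    NOT attisdropped   -- exclude dropped columns
   AND    attname <> _key;   -- the key column is the grouping column

   IF _cond IS NULL THEN     -- no columns besides the key: nothing to check
      RETURN;
   END IF;

   RETURN QUERY EXECUTE format(
      'SELECT %1$I::text FROM %2$s GROUP BY %1$I HAVING %3$s'
    , _key, _tbl, _cond);
END
$func$;

-- Usage, assuming the example table from the question is myschema.mylog:
SELECT * FROM f_irreconcilable_ids('myschema.mylog', 'id');
```

For the question's example table, the generated statement groups by id and tests count(DISTINCT initialize_time) > 1 OR count(DISTINCT start_time) > 1 OR count(DISTINCT end_time) > 1, so the first example yields no rows and the second yields id 1.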
