How to perform the same aggregation on every column, without listing the columns?
Question
I have a table with N columns. Let's call them c1, c2, c3, c4, ... cN. Across multiple rows, I want to get a single row with COUNT(DISTINCT cX) for each X in [1, N].
c1 | c2 | ... | cn
0 | 4 | ... | 1
Is there a way I can do this (in a stored procedure) without writing every column name into the query manually?
We've had a problem where bugs in our application servers cause good column values to be overwritten with garbage inserted later. To solve this, I'm storing the information in a log-structured form, where each row represents a logical UPDATE query. Then, when given a signal that the record is complete, I can determine whether any values were (erroneously) overwritten.
An example of a single correct record spread over multiple rows: there is at most one value for each column.
| id | initialize_time | start_time | end_time |
| 1 | 12:00am | NULL | NULL |
| 1 | 12:00am | 1:00pm | NULL |
| 1 | 12:00am | NULL | 2:00pm |
Reconciled row:
| 1 | 12:00am | 1:00pm | 2:00pm |
An example of an irreconcilable record that I want to detect:
| id | initialize_time | start_time | end_time |
| 1 | 12:00am | NULL | NULL |
| 1 | 12:00am | 1:00pm | NULL |
| 1 | 9:00am | 1:00pm | 2:00pm | -- New initialize time => irreconcilable!
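For comparison, the check written out by hand for this three-column example might look like the sketch below. The table name record_log is an assumption (the table is not named above), and every column has to be listed explicitly, which is exactly what becomes impractical with N columns. count(DISTINCT col) ignores NULL values, so a reconcilable record yields at most 1 for every column.

```sql
-- Sketch only. record_log is a hypothetical table name.
-- Returns the ids whose rows hold more than one distinct value in some column.
SELECT id
     , count(DISTINCT initialize_time) AS initialize_time_values
     , count(DISTINCT start_time)      AS start_time_values
     , count(DISTINCT end_time)        AS end_time_values
FROM   record_log
GROUP  BY id
HAVING count(DISTINCT initialize_time) > 1
    OR count(DISTINCT start_time)      > 1
    OR count(DISTINCT end_time)        > 1;
```

For the irreconcilable example, initialize_time has two distinct values (12:00am and 9:00am), so id 1 is returned. For the correct example, no column exceeds one distinct value and the query returns no rows.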
Recommended answer
You need dynamic SQL for that, which means you have to create a function or run a DO command. Since you cannot return values directly from the latter, a plpgsql function it is:
CREATE OR REPLACE FUNCTION f_count_all(_tbl text
                         , OUT columns text[], OUT counts bigint[])
  RETURNS record
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- Build one SELECT from the system catalog and run it:
   -- first array = column names, second array = count(col) per column.
   -- count(col) counts non-null values; count(DISTINCT col) would count distinct ones.
   EXECUTE (
      SELECT 'SELECT
         ARRAY[' || string_agg('''' || quote_ident(attname) || '''', ', ') || '],
         ARRAY[' || string_agg('count(' || quote_ident(attname) || ')', ', ') || ']
      FROM ' || _tbl
      FROM   pg_attribute
      WHERE  attrelid = _tbl::regclass
      AND    attnum >= 1             -- exclude tableoid & friends (neg. attnum)
      AND    attisdropped IS FALSE   -- exclude dropped columns
      GROUP  BY attrelid
      )
   INTO columns, counts;
END
$func$;
Call:
SELECT * FROM f_count_all('myschema.mytable');
Returns:
  columns   |  counts
------------+----------
 {c1,c2,c3} | {17,1,0}
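The function above counts non-null values per column. Since the question asks for distinct counts, a variant along the following lines might be closer to what is needed. This is a sketch, not part of the original answer: the name f_count_distinct_all is made up for illustration, the parameter is declared as regclass so the table name is validated and quoted on input, and format() with quote_literal() replaces the manual string concatenation. It assumes every column's type has an equality operator; count(DISTINCT ...) raises an error for types such as json that do not.

```sql
-- Sketch only: f_count_distinct_all is a hypothetical name.
CREATE OR REPLACE FUNCTION f_count_distinct_all(_tbl regclass
                          , OUT columns text[], OUT counts bigint[])
  RETURNS record
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE (
      -- Generates: SELECT ARRAY['c1', ...]::text[],
      --                   ARRAY[count(DISTINCT c1), ...]::bigint[] FROM tbl
      SELECT format('SELECT ARRAY[%s]::text[], ARRAY[%s]::bigint[] FROM %s'
                  , string_agg(quote_literal(attname), ', ' ORDER BY attnum)
                  , string_agg(format('count(DISTINCT %I)', attname), ', ' ORDER BY attnum)
                  , _tbl)
      FROM   pg_attribute
      WHERE  attrelid = _tbl
      AND    attnum >= 1          -- skip system columns
      AND    NOT attisdropped     -- skip dropped columns
      )
   INTO columns, counts;
END
$func$;
```

To list only the columns holding more than one distinct value, the two arrays can be unnested in parallel (multi-argument unnest() requires PostgreSQL 9.4 or later):

```sql
SELECT u.col, u.cnt
FROM   f_count_distinct_all('myschema.mytable') f
     , unnest(f.columns, f.counts) AS u(col, cnt)
WHERE  u.cnt > 1;
```

Note that this counts across the whole table. Restricting the check to the rows of a single record would need an additional WHERE clause in the generated query.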
More explanation and links about dynamic SQL and EXECUTE can be found in this related question, or in several more here on SO; try this search.
Very similar to this question:
postgresql - count (no null values)
You could even try to return a polymorphic record type to get single columns dynamically, but that's rather complex and advanced. Probably too much effort for your case. More in this related answer.