从表中选择一组动态的列,并获取每个列的总和 [英] Select a dynamic set of columns from a table and get the sum for each

查看:120
本文介绍了从表中选择一组动态的列,并获取每个列的总和的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否可以在Postgres中执行以下操作:

Is it possible to do the following in Postgres:

SELECT column_name FROM information_schema WHERE table_name = 'somereport' AND data_type = 'integer';

SELECT SUM(coulmn_name[0]),SUM(coulmn_name[1]) ,SUM(coulmn_name[3]) FROM somereport;

换句话说,我需要根据特定条件从表中选择一组列,然后将表中的每个列求和.

In other words I need to select a group of columns from a table depending on certain criteria, and then sum each of those columns in the table.

我知道我可以循环执行此操作,因此我可以独立地对每一列进行计数,但是显然,这需要对信息模式查询返回的每一列进行查询.例如:

I know I can do this in a loop, so I can count each column independently, but obviously that requires a query for each column returned from the information schema query. Eg:

FOR r IN select column_name from information_schema where report_view_name = 'somereport' and data_type = 'integer';
LOOP
    SELECT SUM(r.column_name) FROM somereport;
END

推荐答案

此查询将创建您要使用的完整DML语句:

This query creates the complete DML statement you are after:

WITH x AS (
   SELECT 'public'::text     AS _schema  -- provide schema name ..
         ,'somereport'::text AS _tbl     -- .. and table name once
   )
SELECT 'SELECT ' || string_agg('sum(' || quote_ident(column_name)
                 || ') AS sum_' || quote_ident(column_name), ', ')
       || E'\nFROM   ' || quote_ident(x._schema) || '.' || quote_ident(x._tbl)
FROM   x, information_schema.columns
WHERE  table_schema = _schema
AND    table_name = _tbl
AND    data_type = 'integer'
GROUP  BY x._schema, x._tbl;

您可以单独执行它,也可以将此查询包装在plpgsql函数中,然后使用EXECUTE自动运行查询:

You can execute it separately or wrap this query in a plpgsql function and run the query automatically with EXECUTE:

经过PostgreSQL 9.1.4测试

Tested with PostgreSQL 9.1.4

CREATE OR REPLACE FUNCTION f_get_sums(_schema text, _tbl text)
  RETURNS TABLE(names text[], sums bigint[]) AS
$BODY$
BEGIN

RETURN QUERY EXECUTE (
    SELECT 'SELECT ''{'
           || string_agg(quote_ident(c.column_name), ', ' ORDER BY c.column_name)
           || '}''::text[],
           ARRAY['
           || string_agg('sum(' || quote_ident(c.column_name) || ')'
                                                   , ', ' ORDER BY c.column_name)
           || ']
    FROM   '
           || quote_ident(_schema) || '.' || quote_ident(_tbl)
    FROM   information_schema.columns c
    WHERE  table_schema = _schema
    AND    table_name = _tbl
    AND    data_type = 'integer'
    );

END;
$BODY$
  LANGUAGE plpgsql;

致电:

SELECT unnest(names) AS name, unnest (sums) AS col_sum
FROM   f_get_sums('public', 'somereport');

返回:

   name        | col_sum
---------------+---------
 int_col1      |    6614
 other_int_col |    8364
 third_int_col | 2720642

解释

困难在于为函数定义RETURN类型,而返回的列数和名称将有所不同.一个细节有所帮助:您只需要integer列.

Explain

The difficulty is to define the RETURN type for the function, while number and names of columns returned will vary. One detail that helps a little: you only want integer columns.

我通过形成bigint的数组解决了这个问题(sum(int_col)返回bigint).另外,我返回一个列名数组.两者均按列名按字母顺序排序.

I solved this by forming an array of bigint (sum(int_col) returns bigint). In addition I return an array of column names. Both sorted alphabetically by column name.

在函数调用中,我使用

In the function call I split up these arrays with unnest() arriving at the handsome format displayed.

动态创建和执行的查询是高级内容.不要被多层引号弄糊涂了.基本上,您有EXECUTE,它接受包含SQL查询的文本参数来执行.反过来,此文本由辅助SQL查询提供,该辅助SQL查询构建了主查询的查询字符串.

The dynamically created and executed query is advanced stuff. Don't get confused by multiple layers of quotes. Basically you have EXECUTE that takes a text argument containing the SQL query to execute. This text, in turn, is provided by secondary SQL query that builds the query string of the primary query.

如果这一次太多了,或者plpgsql对于您来说还是很新的,请从此相关答案开始,我解释了处理简单得多的功能的基础知识,并提供了主要功能手册的链接.

If this is too much at once or plpgsql is rather new for you, start with this related answer where I explain the basics dealing with a much simpler function and provide links to the manual for the major features.

如果性能是必不可少的,则直接查询Postgres目录(pg_catalog.pg_attributes),而不是使用标准化的(但速度较慢)information_schema.columns.这是一个带有pg_attributes 的简单示例.

If performance is essential query the Postgres catalog directly (pg_catalog.pg_attributes) instead of using the standardized (but slow) information_schema.columns. Here is a simple example with pg_attributes.

这篇关于从表中选择一组动态的列,并获取每个列的总和的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆