执行动态交叉表查询 [英] Execute a dynamic crosstab query

查看:102
本文介绍了执行动态交叉表查询的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在我的Postgres数据库中实现了此功能: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/

功能如下:

create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
    dynsql1 varchar;
    dynsql2 varchar;
    columnlist varchar;
begin
    -- 1. retrieve list of column names.
    dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
    execute dynsql1 into columnlist;
    -- 2. set up the crosstab query
    dynsql2 = 'select * from crosstab (
 ''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
 ''select distinct '||colc||' from '||tablename||' order by 1''
 )
 as ct (
 '||rowc||' varchar,'||columnlist||'
 );';
    return dynsql2;
end
$$;

所以现在我可以调用该函数了:

So now I can call the function:

select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');

哪个返回(因为该函数的返回类型为varchar):

Which returns (because the return type of the function is varchar):

select * from crosstab (
   'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
    from globalpayments
    group by 1,2
    order by 1,2'
 , 'select distinct currency
    from globalpayments
    order by 1'
   )  as ct ( month varchar,CAD text,EUR text,GBP text,USD text );

如何获取此函数以不仅为动态交叉表生成代码,还执行结果?即,当我手动复制/粘贴/执行时的结果是这样的.但我希望它无需执行额外的步骤即可执行:该函数应组装动态查询 并执行它:

How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:

编辑1

此函数很接近,但我需要它返回的不仅仅是第一条记录的第一列

This function comes close, but I need it to return more than just the first column of the first record

来自:用法:select * from eval($$select * from analytics limit 1$$)

但是它只返回第一条记录的第一列:

However it just returns the first column of the first record :

eval
----
2015

实际结果如下:

Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000

推荐答案

您要的是 不可能 . SQL是一种严格类型化的语言. PostgreSQL函数需要在创建 时间声明返回类型(RETURNS ..).

解决此问题的一种有限方法是使用多态函数.如果您可以在函数调用 时提供返回类型.但这从您的问题中看不出来.

A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.

可以返回带有匿名记录的完全动态的结果.但是随后您需要在每次调用时提供一个列定义列表.您如何知道返回的列?赶上22..

You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.

有多种解决方法,具体取决于您需要或可以使用的方法.由于您的所有数据列似乎都共享相同的数据类型,因此我建议返回一个 array :text[].或者,您可以返回hstorejson之类的文档类型.相关:

There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:

将hstore键动态转换为一组未知键集的列

但是使用两个调用可能会更简单:1:让Postgres构建查询. 2:执行并检索返回的行.

But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.

我完全不会使用您的问题 中提出的Eric Minikel的功能.通过恶意格式错误的标识符来防止SQL注入是不安全的.使用 format() 进行构建查询字符串,除非您运行的版本早于Postgres 9.1.

I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.

更短,更简洁的实现可能看起来像这样:

A shorter and cleaner implementation could look like this:

CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
                              , _expr text  -- still vulnerable to SQL injection!
                              , _type regtype)
  RETURNS text AS
$func$
DECLARE
   _cat_list text;
   _col_list text;
BEGIN

-- generate categories for xtab param and col definition list    
EXECUTE format(
 $$SELECT string_agg(quote_literal(x.cat), '), (')
        , string_agg(quote_ident  (x.cat), %L)
   FROM  (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
 , ' ' || _type || ', ', _cat, _tbl)
INTO  _cat_list, _col_list;

-- generate query string
RETURN format(
'SELECT * FROM crosstab(
   $q$SELECT %I, %I, %s
      FROM   %I
      GROUP  BY 1, 2  -- only works if the 3rd column is an aggregate expression
      ORDER  BY 1, 2$q$
 , $c$VALUES (%5$s)$c$
   ) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr  -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type
);

END
$func$ LANGUAGE plpgsql;

与原始版本相同的函数调用.功能crosstab()由附加模块 tablefunc .基础:

Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:

这将安全地处理列名和表名.注意使用对象标识符类型regclassregtype.也适用于模式限定的名称.

This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.

但是,当您传递要作为表达式执行的字符串(原始查询中的_expr-cellc)时, 并不完全安全 .这种输入本质上是不安全的,不能进行SQL注入,因此永远不要向公众公开.

However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.

仅一次 扫描表中的两个类别列表,并且速度应更快一些.

Scans the table only once for both lists of categories and should be a bit faster.

仍然无法返回完全动态的行类型,因为那是绝对不可能的.

Still can't return completely dynamic row types since that's strictly not possible.

这篇关于执行动态交叉表查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆