执行动态交叉表查询 [英] Execute a dynamic crosstab query
问题描述
我在我的Postgres数据库中实现了此功能: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
功能如下:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
所以现在我可以调用该函数了:
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
哪个返回(因为该函数的返回类型为varchar):
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
如何获取此函数以不仅为动态交叉表生成代码,还执行结果?即,当我手动复制/粘贴/执行时的结果是这样的.但我希望它无需执行额外的步骤即可执行:该函数应组装动态查询 并执行它:
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
编辑1
此函数很接近,但我需要它返回的不仅仅是第一条记录的第一列
This function comes close, but I need it to return more than just the first column of the first record
来自:用法:select * from eval($$select * from analytics limit 1$$)
但是它只返回第一条记录的第一列:
However it just returns the first column of the first record :
eval
----
2015
实际结果如下:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
推荐答案
您要的是 不可能 . SQL是一种严格类型化的语言. PostgreSQL函数需要在创建 的时间声明返回类型(RETURNS ..
).
解决此问题的一种有限方法是使用多态函数.如果您可以在函数调用 的时提供返回类型.但这从您的问题中看不出来.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
您可以返回带有匿名记录的完全动态的结果.但是随后您需要在每次调用时提供一个列定义列表.您如何知道返回的列?赶上22..
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
有多种解决方法,具体取决于您需要或可以使用的方法.由于您的所有数据列似乎都共享相同的数据类型,因此我建议返回一个 array :text[]
.或者,您可以返回hstore
或json
之类的文档类型.相关:
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]
. Or you could return a document type like hstore
or json
. Related:
但是使用两个调用可能会更简单:1:让Postgres构建查询. 2:执行并检索返回的行.
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
我完全不会使用您的问题 中提出的Eric Minikel的功能.通过恶意格式错误的标识符来防止SQL注入是不安全的.使用 format()
进行构建查询字符串,除非您运行的版本早于Postgres 9.1.
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format()
to build query strings unless you are running an outdated version older than Postgres 9.1.
更短,更简洁的实现可能看起来像这样:
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type
);
END
$func$ LANGUAGE plpgsql;
与原始版本相同的函数调用.功能crosstab()
由附加模块 tablefunc
.基础:
Same function call as your original version. The function crosstab()
is provided by the additional module tablefunc
which has to be installed. Basics:
这将安全地处理列名和表名.注意使用对象标识符类型regclass
和regtype
.也适用于模式限定的名称.
This handles column and table names safely. Note the use of object identifier types regclass
and regtype
. Also works for schema-qualified names.
但是,当您传递要作为表达式执行的字符串(原始查询中的_expr
-cellc
)时, 并不完全安全 .这种输入本质上是不安全的,不能进行SQL注入,因此永远不要向公众公开.
However, it is not completely safe while you pass a string to be executed as expression (_expr
- cellc
in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
仅一次 扫描表中的两个类别列表,并且速度应更快一些.
Scans the table only once for both lists of categories and should be a bit faster.
仍然无法返回完全动态的行类型,因为那是绝对不可能的.
Still can't return completely dynamic row types since that's strictly not possible.
这篇关于执行动态交叉表查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!