PostgreSQL crosstab query with multiple "row name" columns


Question

I have a table that is a "tall skinny" fact table:

CREATE TABLE facts (
    eff_date     timestamp NOT NULL,
    update_date  timestamp NOT NULL,
    symbol_id    int4      NOT NULL,
    data_type_id int4      NOT NULL,
    source_id    char(3)   NOT NULL,
    fact         decimal,
    /* Keys */
    CONSTRAINT fact_pk
        PRIMARY KEY (source_id, symbol_id, data_type_id, eff_date)
);

I'd like to "pivot" this for a report, so the header looks like this:

eff_date, symbol_id, source_id, datatypeValue1, ... DatatypeValueN

I.e., I'd like one row for each unique combination of eff_date, symbol_id, and source_id.

However, the PostgreSQL crosstab() function only allows one key column.

Any ideas?

Recommended answer

crosstab() expects the following columns from its input query (1st parameter), in this order:

  1. a row_name
  2. (optional) extra columns
  3. a category (matching values in 2nd crosstab parameter)
  4. a value
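
As a minimal sketch of that column order, the call below feeds crosstab() a hand-written VALUES list instead of a real table. The literal rows and the output column names (extra, a, b) are made up for illustration. The only assumption about the environment is that the additional module tablefunc, which provides crosstab(), can be installed in the database.

```sql
-- crosstab() is provided by the additional module tablefunc; install it once per database.
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT *
FROM   crosstab(
   $$VALUES (1, 'x', 'a', 10)   -- row_name, extra, category, value
          , (1, 'x', 'b', 20)
          , (2, 'y', 'a', 30)$$
 , $$VALUES ('a'), ('b')$$      -- full list of categories, in output column order
   ) AS ct (row_name int, extra text, a int, b int);
```

A category that is missing for a row_name (here 'b' for row_name 2) comes back as NULL, and extra columns are copied from the first input row of each row_name group.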

You don't have a row_name. Add a surrogate row_name with the window function dense_rank().

Your question leaves room for interpretation. Let's add sample rows for demonstration:

INSERT INTO facts (eff_date, update_date, symbol_id, data_type_id, source_id)
VALUES
   (now(), now(), 1,  5, 'foo')
 , (now(), now(), 1,  6, 'foo')
 , (now(), now(), 1,  7, 'foo')
 , (now(), now(), 1,  6, 'bar')
 , (now(), now(), 1,  7, 'bar')
 , (now(), now(), 1, 23, 'bar')
 , (now(), now(), 1,  5, 'baz')
 , (now(), now(), 1, 23, 'baz');  -- only two rows for 'baz'
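
To see how dense_rank() collapses the three key columns into one surrogate row_name for these sample rows, a quick check along these lines may help. It only uses the facts table and columns from the question; the DISTINCT is added here purely to show one line per key combination.

```sql
-- One output row per distinct (eff_date, symbol_id, source_id),
-- each labelled with the integer that will serve as row_name.
SELECT DISTINCT
       dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
     , eff_date, symbol_id, source_id
FROM   facts
ORDER  BY row_name;
```

With the sample data this should number bar, baz, and foo as 1, 2, and 3, since all rows share the same eff_date and symbol_id.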



Interpretation #1: first N values

You want to list the first N values of data_type_id (the smallest, if there are more) for each distinct (source_id, symbol_id, eff_date).

For this, you also need a synthetic category, which can be generated with row_number(). The basic query to produce input to crosstab():

SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
     , eff_date, symbol_id, source_id                                   -- extra columns
     , row_number() OVER (PARTITION BY eff_date, symbol_id, source_id
                          ORDER BY data_type_id)::int                   AS category
     , data_type_id                                                     AS value  
FROM   facts
ORDER  BY row_name, category;

The crosstab query:

SELECT *
FROM   crosstab(
  'SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
        , eff_date, symbol_id, source_id                                   -- extra columns
        , row_number() OVER (PARTITION BY eff_date, symbol_id, source_id
                             ORDER BY data_type_id)::int                   AS category
        , data_type_id                                                     AS value  
   FROM   facts
   ORDER  BY row_name, category'
, 'VALUES (1), (2), (3)'
   ) AS (row_name int, eff_date timestamp, symbol_id int, source_id char(3)
       , datatype_1 int, datatype_2 int, datatype_3 int);

Result:



row_name | eff_date       | symbol_id | source_id | datatype_1 | datatype_2 | datatype_3
-------: | :--------------| --------: | :-------- | ---------: | ---------: | ---------:
       1 | 2017-04-10 ... |         1 | bar       |          6 |          7 |         23
       2 | 2017-04-10 ... |         1 | baz       |          5 |         23 |       null
       3 | 2017-04-10 ... |         1 | foo       |          5 |          6 |          7
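
Three output columns are enough here because no key combination in the sample has more than three data types. A query like the following, a sketch that is not part of the original answer, could be used to find how many datatype_N columns the column definition list needs. The alias names per_key, n_types, and max_types_per_key are arbitrary.

```sql
-- Largest number of facts rows sharing one (eff_date, symbol_id, source_id) key.
SELECT max(n_types) AS max_types_per_key
FROM  (
   SELECT count(*) AS n_types
   FROM   facts
   GROUP  BY eff_date, symbol_id, source_id
   ) per_key;
```

If a key has more values than there are listed categories, the two-parameter form of crosstab() ignores the surplus input rows, so only the smallest N data_type_id values are shown.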




Interpretation #2: actual values in column names

You want to append actual values of data_type_id to the column names datatypeValue1, ... DatatypeValueN. One or more of these:

SELECT DISTINCT data_type_id FROM facts ORDER BY 1;

5, 6, 7, 23 in the example. Then the actual display values can be just boolean (or the redundant value?). Basic query:

SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
     , eff_date, symbol_id, source_id                                   -- extra columns
     , data_type_id                                                     AS category
     , TRUE                                                             AS value
FROM   facts
ORDER  BY row_name, category;

The crosstab query:

SELECT *
FROM   crosstab(
  'SELECT dense_rank() OVER (ORDER BY eff_date, symbol_id, source_id)::int AS row_name
        , eff_date, symbol_id, source_id                                   -- extra columns
        , data_type_id                                                     AS category
        , TRUE                                                             AS value
   FROM   facts
   ORDER  BY row_name, category'
, 'VALUES (5), (6), (7), (23)'  -- actual values
   ) AS (row_name int, eff_date timestamp, symbol_id int, source_id char(3)
       , datatype_5 bool, datatype_6 bool, datatype_7 bool, datatype_23 bool);

Result (the surrogate row_name column is not shown):



eff_date       | symbol_id | source_id | datatype_5 | datatype_6 | datatype_7 | datatype_23
:--------------| --------: | :-------- | :--------- | :--------- | :--------- | :----------
2017-04-10 ... |         1 | bar       | null       | t          | t          | t          
2017-04-10 ... |         1 | baz       | t          | null       | null       | t          
2017-04-10 ... |         1 | foo       | t          | t          | t          | null       


dbfiddle here
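
For comparison, a pivot with this boolean shape can also be written without crosstab(), using plain aggregation, which groups by several key columns directly. This is a swapped-in technique rather than the method of the answer above. It is shown as a sketch that assumes PostgreSQL 9.4 or later for the aggregate FILTER clause and the same four data_type_id values as in the example.

```sql
-- Conditional aggregation: one boolean column per known data_type_id value.
SELECT eff_date, symbol_id, source_id
     , bool_or(true) FILTER (WHERE data_type_id = 5)  AS datatype_5
     , bool_or(true) FILTER (WHERE data_type_id = 6)  AS datatype_6
     , bool_or(true) FILTER (WHERE data_type_id = 7)  AS datatype_7
     , bool_or(true) FILTER (WHERE data_type_id = 23) AS datatype_23
FROM   facts
GROUP  BY eff_date, symbol_id, source_id
ORDER  BY eff_date, symbol_id, source_id;
```

bool_or() over no rows yields NULL, so key combinations without a given data type show NULL, just as in the crosstab result.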

Related:

  • Crosstab function in Postgres returning a one row output when I expect multiple rows
  • Dynamic alternative to pivot with CASE and GROUP BY
  • Postgres - Transpose Rows to Columns
