Count combinations of columns in a PostgreSQL matrix


Problem description

I have a table in postgres like below

I want a SQL query in Postgres that counts, for each combination of 2 columns, the rows where both columns are Y (that is, YY).

Expecting an output like

Combination Count

AB 2
AC 1
AD 2
AZ 1
BC 1
BD 3
BZ 2
CD 2
CZ 0
DZ 1

Can anyone help me?

Solution

WITH stacked AS (
    SELECT id
        , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
        , unnest(array[a, b, c, d, z]) AS col_value
    FROM test t
)
SELECT combo, sum(cnt) AS count
FROM (
    SELECT t1.id, t1.col_name || t2.col_name AS combo
        , (CASE WHEN t1.col_value = 'Y' AND t2.col_value = 'Y' THEN 1 ELSE 0 END) AS cnt
    FROM stacked t1
    INNER JOIN stacked t2
    ON t1.id = t2.id
    AND t1.col_name < t2.col_name) t3
GROUP BY combo
ORDER BY combo

yields

| combo | count |
|-------+-------|
| AB    |     2 |
| AC    |     1 |
| AD    |     2 |
| AZ    |     2 |
| BC    |     1 |
| BD    |     3 |
| BZ    |     2 |
| CD    |     2 |
| CZ    |     0 |
| DZ    |     1 |

The unnesting recipe for unpivoting the table comes from Stew's post, here.
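To cross-check the query outside the database, the same logic can be sketched in Python. This is only an illustration: the sample rows below are invented (they are not the asker's table), and the helper names `stack` and `count_pairs` are hypothetical. `stack` mimics the `stacked` CTE, and `count_pairs` mimics the self-join on `id` with `col_name < col_name` followed by `sum(cnt)`.

```python
from collections import Counter

# Hypothetical sample data (NOT the table from the question).
rows = [
    {"id": 1, "a": "Y", "b": "Y", "c": "N", "d": "Y", "z": "N"},
    {"id": 2, "a": "N", "b": "Y", "c": "Y", "d": "Y", "z": "Y"},
    {"id": 3, "a": "Y", "b": "N", "c": "N", "d": "N", "z": "Y"},
]
columns = ["a", "b", "c", "d", "z"]


def stack(rows, columns):
    """Unpivot: one (id, col_name, col_value) tuple per cell, like the stacked CTE."""
    return [(r["id"], c.upper(), r[c]) for r in rows for c in columns]


def count_pairs(stacked):
    """Self-join on id with name1 < name2, then sum a 1/0 indicator per pair."""
    counts = Counter()
    for id1, name1, val1 in stacked:
        for id2, name2, val2 in stacked:
            if id1 == id2 and name1 < name2:
                # Adding 0 still creates the key, so pairs with no YY rows
                # appear with a count of 0, as in the SQL output.
                counts[name1 + name2] += 1 if (val1 == "Y" and val2 == "Y") else 0
    return dict(sorted(counts.items()))


pair_counts = count_pairs(stack(rows, columns))
for combo, count in pair_counts.items():
    print(combo, count)
```

With these invented rows, BD is counted twice (rows 1 and 2) and AC gets 0, which is the behavior the `CASE ... THEN 1 ELSE 0 END` expression gives in the SQL.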


To count occurrences of YYY among 3 columns you could use:

WITH stacked AS (
    SELECT id
        , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
        , unnest(array[a, b, c, d, z]) AS col_value
    FROM test t
)
SELECT combo, sum(cnt) AS count
FROM (
    SELECT t1.id, t1.col_name || t2.col_name || t3.col_name AS combo
        , (CASE WHEN t1.col_value = 'Y' 
               AND t2.col_value = 'Y'
               AND t3.col_value = 'Y' THEN 1 ELSE 0 END) AS cnt
    FROM stacked t1
    INNER JOIN stacked t2
    ON t1.id = t2.id
    AND t1.col_name < t2.col_name
    INNER JOIN stacked t3
    ON t1.id = t3.id
    AND t2.col_name < t3.col_name
    ) t
GROUP BY combo
ORDER BY combo
;

which yields

| combo | count |
|-------+-------|
| ABC   |     0 |
| ABD   |     1 |
| ABZ   |     2 |
| ACD   |     1 |
| ACZ   |     0 |
| ADZ   |     1 |
| BCD   |     1 |
| BCZ   |     0 |
| BDZ   |     1 |
| CDZ   |     0 |

Or, to handle combinations of N columns, you could use WITH RECURSIVE. For example, for N = 3:

WITH RECURSIVE result AS (
    WITH stacked AS (
        SELECT id
            , unnest(array['A', 'B', 'C', 'D', 'Z']) AS col_name
            , unnest(array[a, b, c, d, z]) AS col_value
        FROM test t)
    SELECT id, array[col_name] AS path, array[col_value] AS path_val, col_name AS last_name
    FROM stacked

    UNION

    SELECT r.id, path || s.col_name, path_val || s.col_value, s.col_name
    FROM result r
    INNER JOIN stacked s
    ON r.id = s.id
        AND s.col_name > r.last_name
    WHERE array_length(r.path, 1) < 3)  -- Change 3 to your value for N
SELECT combo, sum(cnt) AS count
FROM (
    SELECT id, array_to_string(path, '') AS combo, (CASE WHEN 'Y' = all(path_val) THEN 1 ELSE 0 END) AS cnt
    FROM result
    WHERE array_length(path, 1) = 3) t  -- Change 3 to your value for N
GROUP BY combo
ORDER BY combo

Note that N = 3 is used in 2 places in the SQL above.
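As a sanity check for any N, the expected counts can also be computed in a few lines of Python with `itertools.combinations`. This is a minimal sketch, not part of the answer's SQL: the rows are invented sample data and `count_combos` is a hypothetical helper name. It assumes the column list is given in sorted order, so the labels match the `col_name` ordering enforced by the recursive query.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample data (NOT the table from the question).
rows = [
    {"id": 1, "a": "Y", "b": "Y", "c": "N", "d": "Y", "z": "N"},
    {"id": 2, "a": "N", "b": "Y", "c": "Y", "d": "Y", "z": "Y"},
    {"id": 3, "a": "Y", "b": "N", "c": "N", "d": "N", "z": "Y"},
]
columns = ["a", "b", "c", "d", "z"]  # assumed to be in sorted order


def count_combos(rows, columns, n):
    """For each n-column combination, count the rows where every column is 'Y'."""
    counts = Counter()
    for row in rows:
        for combo in combinations(columns, n):
            label = "".join(c.upper() for c in combo)
            # int(True) == 1 and int(False) == 0, so combinations that never
            # match are still listed with a count of 0.
            counts[label] += int(all(row[c] == "Y" for c in combo))
    return dict(sorted(counts.items()))


triple_counts = count_combos(rows, columns, 3)
for combo, count in triple_counts.items():
    print(combo, count)
```

Here N appears only once, as the `n` argument, whereas the recursive SQL needs it changed in two places.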
