Aggregating based on GROUPING of multiple columns

Problem description

I am trying to use a subquery and aggregate in SQL after doing an initial query with multiple joins. My ultimate goal is to get a count (or a sum) of specimens tested based on a grouping of multiple columns. This is slightly different from "SQL Server query - Selecting COUNT(*) with DISTINCT" and "SQL Server: aggregate error on grouping".

The three tables that I use (PERSON, SPECIMEN, TEST) have one-to-many relationships: a PERSON has many SPECIMENS, and each SPECIMEN has many TESTS. I did three inner joins to combine these tables plus an additional table (ANALYSIS).

WITH TALLY as (
    SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
           ANALYSIS.ANALYSIS_TYPE, SPECIMEN.SPECIMEN_NUMBER
    FROM DB.TEST
    INNER JOIN DB.SPECIMEN ON TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
    INNER JOIN DB.PERSON ON SPECIMEN.PERSON = PERSON.NAME
    INNER JOIN DB.ANALYSIS ON TEST.ANALYSIS = ANALYSIS.NAME
    WHERE PERSON.NAME = 'Joe'
    AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <= '01-APR-18'
    AND PERSON.PHASE = 'PHASE1'
    ORDER BY TEST.DATE_STARTED)

SELECT COUNT(DISTINCT ANALYSIS) as SPECIMEN_COUNT, DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY
GROUP BY DATE_STARTED, ANALYSIS, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;

This gives me repeated rows: the first grouping is repeated 4 times.

What I am trying to see is the first grouping aggregated into a single row with the total count.

Any thoughts as to what is missing? SUM instead of COUNT or in addition to COUNT creates an error. Thanks in advance!

9/17/2020 Update: I have tried adding a subquery because I also need to use a new column of metadata (ANALYSIS_TYPE_ALIAS), which is created in the first query through a CASE expression (...). I have also tried using another subquery with an inner join to count based on those conditions into a temp table, but I still cannot seem to aggregate to flatten the table. Here is my current attempt:

WITH TALLY as (
    SELECT PERSON.NAME, PERSON.PHASE, TEST.DATE_STARTED, TEST.ANALYSIS, SPECIMEN.GROUP, TEST.STATUS,
           ANALYSIS.ANALYSIS_TYPE...
    FROM DB.TEST
    INNER JOIN DB.SPECIMEN ON TEST.SPECIMEN_NUMBER = SPECIMEN.SPECIMEN_NUMBER
    INNER JOIN DB.PERSON ON SPECIMEN.PERSON = PERSON.NAME
    INNER JOIN DB.ANALYSIS ON TEST.ANALYSIS = ANALYSIS.NAME
    WHERE PERSON.NAME = 'Joe'
    AND TEST.DATE_STARTED >= '20-DEC-16' AND TEST.DATE_STARTED <= '01-APR-18'
    AND PERSON.PHASE = 'PHASE1'
    ORDER BY TEST.DATE_STARTED),
SUMMARY_COMBO AS (
    SELECT DISTINCT(CONCAT(CONCAT(CONCAT(CONCAT(ANALYSIS, DATE_STARTED), STATUS), GROUP), ANALYSIS_TYPE_ALIAS)) AS UUID,
           TALLY.NAME, TALLY.PHASE, TALLY.DATE_STARTED, TALLY.ANALYSIS, TALLY.GROUP, TALLY.STATUS, TALLY.ANALYSIS_TYPE_ALIAS
    FROM TALLY)
SELECT SUMMARY_COMBO.NAME, SUMMARY_COMBO.PHASE, SUMMARY_COMBO.DATE_STARTED, SUMMARY_COMBO.ANALYSIS, SUMMARY_COMBO.GROUP, SUMMARY_COMBO.STATUS, SUMMARY_COMBO.ANALYSIS_TYPE_ALIAS,
       COUNT(SUMMARY_COMBO.ANALYSIS) OVER (PARTITION BY SUMMARY_COMBO.UUID) AS SPECIMEN_COUNT
FROM SUMMARY_COMBO
ORDER BY SUMMARY_COMBO.DATE_STARTED;

This gave me a table that shows aggregated counts but doesn't collapse the rows based on the unique UUID. Is there a way to take the sum of the count? I've tried to do this by storing the count in a subquery and then referencing that count, but I am missing something in how to group the 8 columns of data that I want to show plus the count of that combination of columns.

Thanks!

Solution

Just remove ANALYSIS from the GROUP BY clause, since that's the column whose distinct values you want to count. It also has to come out of the SELECT list, because most databases reject a non-aggregated column that is not part of the GROUP BY. Otherwise, the query generates more groups than you need (and the count of distinct ANALYSIS values in each group is always 1).

WITH TALLY as ( ...)
SELECT COUNT(DISTINCT ANALYSIS) as SPECIMEN_COUNT, DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
FROM TALLY
GROUP BY DATE_STARTED, STATUS, GROUP, ANALYSIS_TYPE
ORDER BY DATE_STARTED;
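
To see the effect on a small scale, here is a minimal sketch using Python's built-in sqlite3 module. It is only an illustration under assumptions: the rows are made up, the table tally stands in for the TALLY CTE, and the GROUP column is renamed specimen_group because GROUP is a reserved word. SQLite is used purely for convenience; the database in the question may differ in details such as date handling.

```python
import sqlite3

# Hypothetical rows standing in for the output of the TALLY CTE.
# Columns: date_started, analysis, status, specimen_group, analysis_type
rows = [
    ("2017-01-05", "PH",    "DONE", "A", "CHEM"),
    ("2017-01-05", "ASSAY", "DONE", "A", "CHEM"),
    ("2017-01-05", "WATER", "DONE", "A", "CHEM"),
    ("2017-01-05", "PH",    "DONE", "A", "CHEM"),  # same analysis run on another specimen
    ("2017-02-10", "PH",    "DONE", "B", "CHEM"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tally (date_started TEXT, analysis TEXT, status TEXT, "
    "specimen_group TEXT, analysis_type TEXT)"
)
conn.executemany("INSERT INTO tally VALUES (?, ?, ?, ?, ?)", rows)

# Shape of the query in the question: analysis is part of the GROUP BY,
# so every group holds exactly one analysis value and the count is always 1.
with_analysis = conn.execute("""
    SELECT COUNT(DISTINCT analysis) AS specimen_count,
           date_started, analysis, status, specimen_group, analysis_type
    FROM tally
    GROUP BY date_started, analysis, status, specimen_group, analysis_type
    ORDER BY date_started, analysis
""").fetchall()

# Shape of the suggested fix: analysis is dropped from GROUP BY and SELECT,
# so each combination of the remaining columns collapses into one row.
without_analysis = conn.execute("""
    SELECT COUNT(DISTINCT analysis) AS specimen_count,
           date_started, status, specimen_group, analysis_type
    FROM tally
    GROUP BY date_started, status, specimen_group, analysis_type
    ORDER BY date_started
""").fetchall()

print(len(with_analysis), [r[0] for r in with_analysis])  # 4 [1, 1, 1, 1]
for row in without_analysis:
    print(row)
# (3, '2017-01-05', 'DONE', 'A', 'CHEM')
# (1, '2017-02-10', 'DONE', 'B', 'CHEM')
```

With ANALYSIS in the GROUP BY, the first date produces three rows that each report a count of 1. Without it, the same date collapses into a single row with a count of 3.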
