SQL two criteria from one group-by


Problem description

I have a table with some "functionally duplicate" records: different IDs, but the 4 columns of "user data" (out of even more columns) are identical. I've got a query working that will select all records that have such duplicates.

Now I want to select one row from each group of duplicates: the row that has column A not null, if there is one (I've verified from the data that there is at most one such row per group), and otherwise the row with the minimum ID.

How do I select that? I can't exactly use a non-aggregate in the THEN of a CASE and an aggregate in the ELSE. E.g. this doesn't work:

SELECT CASE
           WHEN d.A IS NULL THEN d.ID
           ELSE MIN(d.ID) END,
       d.B,
       d.C,
       d.E,
       d.F
FROM TABLE T
JOIN (my duplicate query here) D ON T.B=D.B
AND T.C=D.C
AND T.E=D.E
AND T.F=D.F
GROUP BY T.B,
         T.C,
         T.E,
         T.F

Error being:

column A must appear in the GROUP BY clause or be used in an aggregate function.

Solution

This can be radically simpler:

SELECT DISTINCT ON (b, c, e, f)
       b, c, e, f, id   -- add more columns freely
FROM   (<duplicate query here>) sub
ORDER  BY b, c, e, f, (a IS NULL), id

  • Your duplicate query has all columns. No need to JOIN to the base table again.

  • Use DISTINCT ON, the Postgres extension of the standard SQL DISTINCT. It keeps the first row of each (b, c, e, f) group according to the ORDER BY.

  • Postgres has a proper boolean type. You can ORDER BY a boolean expression directly. The sequence is FALSE (0), TRUE (1), NULL (NULL). If a is not NULL, the expression (a IS NULL) is FALSE and sorts first, so the row with a non-null a wins whenever the group has one. The rest is ordered by id, so the minimum id wins otherwise. Voilà.

  • Selection of ID happens automatically. According to your description you want the ID of the row selected in this query. Nothing more to do.

  • You can probably integrate this into your duplicate query directly.
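
To see the tie-breaking in action, here is a minimal self-contained sketch for Postgres. The table dup_demo and its rows are hypothetical, invented only for illustration; they stand in for the output of the duplicate query. The sort key (a IS NULL) is FALSE for rows with a non-null a, and FALSE sorts before TRUE, so such a row comes first in its group; otherwise the smallest id comes first.

```sql
-- Hypothetical sample data standing in for the duplicate query's output.
CREATE TEMP TABLE dup_demo (
    id int PRIMARY KEY,
    a  text,              -- nullable column that decides which row wins
    b  int,
    c  int,
    e  int,
    f  int
);

INSERT INTO dup_demo (id, a, b, c, e, f) VALUES
    (1, NULL, 1, 1, 1, 1),
    (2, 'x',  1, 1, 1, 1),   -- group (1,1,1,1): exactly one row with a NOT NULL
    (3, NULL, 1, 1, 1, 1),
    (5, NULL, 2, 2, 2, 2),   -- group (2,2,2,2): no row with a NOT NULL
    (4, NULL, 2, 2, 2, 2);

-- One row per (b, c, e, f) group: the first row in ORDER BY order.
SELECT DISTINCT ON (b, c, e, f)
       b, c, e, f, id, a
FROM   dup_demo
ORDER  BY b, c, e, f, (a IS NULL), id;
-- group (1,1,1,1) yields id 2 (the row with a NOT NULL)
-- group (2,2,2,2) yields id 4 (the minimum id)

-- Default ascending sort order of booleans: FALSE, TRUE, then NULL.
SELECT v
FROM   (VALUES (TRUE), (NULL), (FALSE)) AS t(v)
ORDER  BY v;
```

Note that the leading ORDER BY expressions must match the DISTINCT ON list; the extra sort keys after them only decide which row of each group is kept.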
