SQL two criteria from one group-by

Views: 177  Published: 2017/7/21 18:51:39  Tags: sql postgresql duplicates greatest-n-per-group postgresql-9.0

Problem description

I have a table with some "functionally duplicate" records: different IDs, but 4 columns of "user data" (out of even more columns) are identical. I've got a working query that selects all records that have such duplicates.

Now I want to select one row from each group of duplicates: the row where column A is not null, if there is one (I've verified from the data that there is at most 1 such row per group), and otherwise the row with the minimum ID.

How do I select that? I can't exactly use a non-aggregate in the THEN of a CASE and an aggregate in the ELSE. E.g. this doesn't work:

SELECT CASE
           WHEN d.A IS NULL THEN d.ID
           ELSE MIN(d.ID) END,
       d.B,
       d.C,
       d.E,
       d.F
FROM TABLE T
JOIN (my duplicate query here) D ON T.B=D.B
AND T.C=D.C
AND T.E=D.E
AND T.F=D.F
GROUP BY T.B,
         T.C,
         T.E,
         T.F

Error being:

column A must appear in the GROUP BY clause or be used in an aggregate function.

Solution

This can be radically simpler:

SELECT DISTINCT ON (b, c, e, f)
       b, c, e, f, id   -- add more columns freely
FROM   (<duplicate query here>) sub
ORDER  BY b, c, e, f, (a IS NULL), id

Your duplicate query has all columns. No need to JOIN to the base table again.

Use DISTINCT ON, the Postgres extension of the standard SQL DISTINCT. See the related question "Select first row in each GROUP BY group?".

Postgres has a proper boolean type, so you can ORDER BY a boolean expression directly. The sort order is FALSE, TRUE, NULL. If a is not NULL, the expression (a IS NULL) is FALSE and sorts first, so a row with a non-null a wins whenever the group has one. The remaining rows are ordered by id, which yields the smallest id otherwise. Voilà.

Selection of the ID happens automatically. According to your description you want the ID of the row selected in this query, so there is nothing more to do.

You can probably integrate this into your duplicate query directly.
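
To make the selection rule concrete, here is a minimal Python sketch that emulates what DISTINCT ON combined with ORDER BY does: sort the rows, then keep the first row of each group. It is an illustration of the semantics only, not how Postgres executes the query. The sample rows and the helper names (`group_key`, `sort_key`, `pick_one_per_group`) are made up for this example. The sort key uses the equivalent of (a IS NULL), so that a row with a non-null a sorts first, which is the preference stated in the question.

```python
from itertools import groupby

# Hypothetical sample rows standing in for the duplicate query's output,
# laid out as (id, a, b, c, e, f). The values are invented for illustration.
rows = [
    (1, None,   "x", "y", "z", "w"),
    (2, "note", "x", "y", "z", "w"),
    (3, None,   "p", "q", "r", "s"),
    (4, None,   "p", "q", "r", "s"),
]


def group_key(row):
    # Plays the role of DISTINCT ON (b, c, e, f).
    _id, _a, b, c, e, f = row
    return (b, c, e, f)


def sort_key(row):
    # Plays the role of ORDER BY b, c, e, f, (a IS NULL), id.
    # False sorts before True, so rows with a non-null `a` come first.
    row_id, a = row[0], row[1]
    return group_key(row) + (a is None, row_id)


def pick_one_per_group(rows):
    ordered = sorted(rows, key=sort_key)
    # Keep only the first row of each group, as DISTINCT ON does.
    return [next(group) for _, group in groupby(ordered, key=group_key)]


picked_ids = [row[0] for row in pick_one_per_group(rows)]
print(picked_ids)
```

In this sample, the group ('x', 'y', 'z', 'w') yields id 2 because that row has a non-null a, while the group ('p', 'q', 'r', 's') has no such row and falls back to its smallest id, 3.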
