PostgreSQL equivalent for MySQL GROUP BY


Problem description


I need to find duplicates in a table. In MySQL I simply write:

SELECT *,count(id) count FROM `MY_TABLE`
GROUP BY SOME_COLUMN ORDER BY count DESC

This query nicely:

  • Finds duplicates based on SOME_COLUMN, giving its repetition count.
  • Sorts in desc order of repetition, which is useful to quickly scan major dups.
  • Chooses a random value for all remaining columns, giving me an idea of values in those columns.

Similar query in Postgres greets me with an error:

column "MY_TABLE.SOME_COLUMN" must appear in the GROUP BY clause or be used in an aggregate function

What is the Postgres equivalent of this query?

PS: I know that MySQL behaviour deviates from SQL standards.

Solution

Back-ticks are a non-standard MySQL thing. Use the canonical double quotes to quote identifiers (possible in MySQL, too). That is, if your table in fact is named "MY_TABLE" (all upper case). If you (more wisely) named it my_table (all lower case), then you can remove the double quotes or use lower case.
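
To make the quoting rule concrete, here is a minimal sketch of how PostgreSQL folds unquoted identifiers to lower case. The two tables are hypothetical and exist only for illustration; they are not part of the original question.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE "MY_TABLE" (id int);   -- quoted: the name keeps its upper case
CREATE TABLE my_table   (id int);   -- unquoted: the name is stored in lower case

SELECT * FROM "MY_TABLE";   -- refers to the upper-case table
SELECT * FROM MY_TABLE;     -- unquoted, so folded to my_table (the lower-case table)
SELECT * FROM my_table;     -- same table as the previous statement
```

Both tables can coexist because their stored names differ, which is also why mixed-case names tend to cause confusion.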

Also, I use ct instead of count as alias, because it is bad practice to use function names as identifiers.

Simple case

This would work with PostgreSQL 9.1 or later, which recognizes that all other columns are functionally dependent on the primary key:

SELECT *, count(id) ct
FROM   my_table
GROUP  BY primary_key_column(s)
ORDER  BY ct DESC;

It requires primary key column(s) in the GROUP BY clause. The results are identical to a MySQL query, but ct would always be 1 (or 0 if id IS NULL) - useless to find duplicates.
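
As a small illustration of why this variant cannot find duplicates, the sketch below runs it against a made-up table. The table name demo_pk, its columns, and the sample rows are assumptions for the example only.

```sql
-- Hypothetical table and data; id is the primary key.
CREATE TEMP TABLE demo_pk (
   id          int PRIMARY KEY
 , some_column text
 , col1        text
);

INSERT INTO demo_pk VALUES
   (1, 'a', 'x')
 , (2, 'a', 'y')
 , (3, 'b', 'z');

-- Accepted in PostgreSQL 9.1 or later: every other column is
-- functionally dependent on the primary key, so * is allowed.
SELECT *, count(id) AS ct
FROM   demo_pk
GROUP  BY id
ORDER  BY ct DESC;
-- Each group holds exactly one row, so ct is 1 on all three rows,
-- even though some_column = 'a' occurs twice.
```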

Group by other than primary key columns

If you want to group by other column(s), things get more complicated. This query mimics the behavior of your MySQL query - and you can use *.

SELECT DISTINCT ON (1, some_column)
       count(*) OVER (PARTITION BY some_column) AS ct
      ,*
FROM   my_table
ORDER  BY 1 DESC, some_column, id, col1;

This works because DISTINCT ON (PostgreSQL specific), like DISTINCT (SQL standard), is applied after the window function count(*) OVER (...). Window functions (with the OVER clause) require PostgreSQL 8.4 or later; MySQL only added them in version 8.0.

Works with any table, regardless of primary or unique constraints.

The 1 in DISTINCT ON and ORDER BY is just shorthand to refer to the ordinal number of the item in the SELECT list.
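
A worked example may help here. The sketch below applies the same query to a made-up table; the name demo_dups and the sample rows are assumptions, chosen so that one value of some_column appears three times.

```sql
-- Hypothetical table and data; no primary key or unique constraint needed.
CREATE TEMP TABLE demo_dups (id int, some_column text, col1 text);

INSERT INTO demo_dups VALUES
   (1, 'a', 'x')
 , (2, 'a', 'y')
 , (3, 'b', 'z')
 , (4, 'a', 'w');

SELECT DISTINCT ON (1, some_column)
       count(*) OVER (PARTITION BY some_column) AS ct  -- computed per row first
     , *
FROM   demo_dups
ORDER  BY 1 DESC, some_column, id, col1;
--  ct | id | some_column | col1
--   3 |  1 | a           | x
--   1 |  3 | b           | z
```

The window function first attaches the group size to every row; DISTINCT ON then keeps one row per (ct, some_column). Unlike the MySQL query, which row survives is not arbitrary: the trailing ORDER BY items (id, col1) decide it, here the row with the smallest id in each group.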

SQL Fiddle to demonstrate both side by side.



count(*) vs. count(id)

If you are looking for duplicates, you are better off with count(*) than with count(id). There is a subtle difference if id can be NULL, because NULL values are not counted - while count(*) counts all rows. If id is defined NOT NULL, results are the same, but count(*) is generally more appropriate (and slightly faster, too).
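
The difference can be seen in a short sketch. The table demo_counts and its rows are hypothetical; one row deliberately has a NULL id.

```sql
-- Hypothetical table and data; id is nullable on purpose.
CREATE TEMP TABLE demo_counts (id int, some_column text);

INSERT INTO demo_counts VALUES
   (1,    'a')
 , (NULL, 'a')
 , (2,    'b');

SELECT some_column
     , count(*)  AS ct_rows  -- counts every row in the group
     , count(id) AS ct_ids   -- skips rows where id IS NULL
FROM   demo_counts
GROUP  BY some_column
ORDER  BY some_column;
--  some_column | ct_rows | ct_ids
--  a           |       2 |      1
--  b           |       1 |      1
```

With count(id), the duplicate on some_column = 'a' would go unnoticed, because the group reports a count of 1.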
