将查询中的单个列与多列聚合 [英] Aggregate a single column in query with many columns

查看:21
本文介绍了将查询中的单个列与多列聚合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

当查询中有许多其他列时,是否有适当的方法来聚合单个列?

Is there a proper way to aggregate a single column when I have many other columns in the query?

我试过 这个答案 有效,但我的查询变得更加冗长.

I've tried this answer which works, but my query has become a lot more verbose.

我当前的查询如下所示:

My current query looks like this:

SELECT t1.foo1, t1.foo2, t2.foo3, t2.foo4, string_agg(t3.aggregated_field, ', ')
FROM tbl1 t1
LEFT JOIN tbl2 t2 ON t1.id = t2.fkeyid
LEFT JOIN tbl3 t3 ON t2.id = t3.fkeyid
GROUP BY t1.foo1, t1.foo2, t2.foo3, t2.foo4, t2.foo5, t2.foo6
ORDER BY t2.foo5, t2.foo6

查询有更多的字段和LEFT JOIN,重要的部分是所有这些字段都有 1 到 1 或 1 到 0 的关系,除了我想要的 1 到 n 的一个字段聚合,由上面伪查询中的 t3.aggregated_field 表示.

The query has many more fields and LEFT JOINs, the important part is that all these fields have 1 to 1 or 1 to 0 relationship except one field that is 1 to n which I want to aggregate, represented by t3.aggregated_field in the pseudo-query above.

当我使用聚合函数时,SELECTORDER BY 中列出的所有字段必须是聚合的或 GROUP BY 的一部分代码> 子句.这使我的查询方式比现在更加冗长.

As I'm using an aggregate function, all fields listed in the SELECT and ORDER BY must be either aggregated or part of the GROUP BY clause. This makes my query way more verbose than it already is.

即假设foo1为主键,当该字段重复时,除aggregated_field外的其他字段也相等.我希望这些重复的行作为带有聚合字段值的单行结果.(基本上是带有聚合列的 select distinct)

That is, assuming foo1 is a primary key, when this field is repeated, all others except aggregated_field are also equal. I want these repeated rows as a single row result with the aggregated field value. (basically a select distinct with an aggregated column)

是否有更好的方法来做到这一点(无需将所有其他字段放在 GROUP BY 中),或者我应该在后端迭代结果集,为每个字段执行查询获取这个 1 到 n 关系的行?

Is there a better way to do this (without having to put all other fields in the GROUP BY) or should I just iterate over the result set in my back-end executing a query for each row fetching this 1 to n relationship?

服务器正在运行 PostgreSQL 9.1.9,更具体地说:

The server is running PostgreSQL 9.1.9, more specifically:

PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu,由 gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54) 编译,64 位

PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-54), 64-bit

推荐答案

简单查询

使用 PostgreSQL 9.1 或更高版本,这可以简单得多.正如在这个密切相关的答案中所解释的:

Simple query

This can be much simpler with PostgreSQL 9.1 or later. As explained in this closely related answer:

GROUP BY一个表的主键就足够了.由于:

It is enough to GROUP BY the primary key of a table. Since:

foo1 是主键

.. 您可以将示例简化为:

.. you can simplify your example to:

SELECT foo1, foo2, foo3, foo4, foo5, foo6, string_agg(aggregated_field, ', ')
FROM   tbl1
GROUP  BY 1
ORDER  BY foo7, foo8;  -- have to be spelled out, since not in select list!

多表查询

但是,既然您已经:

Query with multiple tables

However, since you have:

更多的字段和 LEFT JOIN,重要的部分是所有这些字段都具有 1 到 1 或 1 到 0 的关系,除了我想要聚合的 1 到 n 的一个字段

many more fields and LEFT JOINs, the important part is that all these fields have 1 to 1 or 1 to 0 relationship except one field that is 1 to n which I want to aggregate

..先聚合,后加入应该更快更简单:

SELECT t1.foo1, t1.foo2, ...
     , t2.bar1, t2.bar2, ...
     , a.aggregated_col 
FROM   tbl1 t1
LEFT   JOIN tbl2 t2 ON ...
...
LEFT   JOIN (
   SELECT some_id, string_agg(agg_col, ', ') AS aggregated_col
   FROM   agg_tbl a ON ...
   GROUP  BY some_id
   ) a ON a.some_id = ?.some_id
ORDER  BY ...

这样,您查询的大部分内容根本不需要聚合.

This way the big portion of your query does not need aggregation at all.

我最近在 SQL Fiddle 中提供了一个测试用例来证明这个相关答案中的观点:

I recently provided a test case in an SQL Fiddle to prove the point in this related answer:

由于您指的是 此相关答案:不,DISTINCT 不会在这种情况下完全没有帮助.

Since you are referring to this related answer: No, DISTINCT is not going to help at all in this case.

这篇关于将查询中的单个列与多列聚合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆