Multiple array_agg() calls in a single query


Problem description

I'm trying to accomplish something with my query, but it's not really working. My application used to run on MongoDB, so it expects to receive arrays in a single field. We have now had to switch to Postgres, and I don't want to change my application code, so that v1 keeps working.

To get arrays in a single field in Postgres, I used the array_agg() function, and that has worked fine so far. However, I'm now at a point where I need a second array in another field, taken from a different table.

For example:

I have employees. Each employee has multiple addresses and multiple working days.

SELECT name, age, array_agg(ad.street) FROM employees e 
JOIN address ad ON e.id = ad.employeeid
GROUP BY name, age

This worked fine for me. It would return, for example:

| name  | age | array_agg(ad.street)     |
| peter | 25  | {1st street, 2nd street} |

Now I want to join another table for working days, so I do:

SELECT name, age, array_agg(ad.street), array_agg(wd.day) FROM employees e 
JOIN address ad ON e.id = ad.employeeid 
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY name, age

This results in:

| peter | 25 | {1st street, 1st street, 1st street, 1st street, 1st street, 2nd street, 2nd street, 2nd street, 2nd street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday,Monday,Tuesday,Wednesday,Thursday,Friday} |

But I need the result to be:

| peter | 25 | {1st street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday} |

I understand it has to do with my joins: because of the multiple joins, the rows multiply. But I don't know how to get the result I need. Can anyone point me in the right direction?

Recommended answer

DISTINCT is often applied to repair queries that are rotten from the inside, and that is often slow and/or incorrect. Don't multiply rows to begin with; then you don't have to sort out unwanted duplicates at the end.

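Joining to multiple n-tables ("has many") at once multiplies rows in the result set. That's like a CROSS JOIN or Cartesian product by proxy. In the example from the question, 2 addresses and 5 working days produce 2 × 5 = 10 rows for the same employee, which is why every street appears 5 times and every day twice.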
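To make the multiplication concrete, here is a minimal sketch in Python that uses the standard-library sqlite3 module as a stand-in for Postgres. Table and column names follow the question and the sample rows are taken from the question's output. Using SQLite and plain count(*) is an assumption of this sketch, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption of this sketch).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address     (employeeid INTEGER, street TEXT);
    CREATE TABLE workingdays (employeeid INTEGER, day TEXT);
    INSERT INTO employees VALUES (1, 'peter', 25);
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
    INSERT INTO workingdays VALUES
        (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday'), (1, 'Thursday'), (1, 'Friday');
""")

# Joining both "has many" tables at once pairs every address with every working day.
joined_rows = con.execute("""
    SELECT count(*)
    FROM   employees e
    JOIN   address     ad ON ad.employeeid = e.id
    JOIN   workingdays wd ON wd.employeeid = e.id
""").fetchone()[0]

# Joining only one of them keeps one row per address.
address_rows = con.execute("""
    SELECT count(*)
    FROM   employees e
    JOIN   address ad ON ad.employeeid = e.id
""").fetchone()[0]

print(address_rows, joined_rows)  # 2 addresses vs. 2 * 5 = 10 joined rows
```

Any aggregate computed over the 10 joined rows sees each street 5 times and each day twice, which matches the duplicated arrays in the question.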

There are various ways to avoid this mistake.

Technically, the query works as long as you join to only one table with multiple rows at a time and aggregate before joining to the next one:

SELECT e.id, e.name, e.age, e.streets, array_agg(wd.day) AS days
FROM  (
   SELECT e.id, e.name, e.age, array_agg(ad.street) AS streets
   FROM   employees e 
   JOIN   address  ad ON ad.employeeid = e.id
   GROUP  BY e.id    -- id is enough if it is defined as the PK
   ) e
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age, e.streets;  -- the subquery has no PK, so list every column

It's also best to include the primary key id and GROUP BY it, because name and age are not necessarily unique. You could merge two employees by mistake.

Alternatively, aggregate in a subquery before you join. That is superior unless you have selective WHERE conditions on employees:

SELECT e.id, e.name, e.age, ad.streets, array_agg(wd.day) AS days
FROM   employees e 
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN   workingdays wd ON e.id = wd.employeeid
GROUP  BY e.id, e.name, e.age, ad.streets;

Or aggregate both in subqueries:

SELECT name, age, ad.streets, wd.days
FROM   employees e 
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN  (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY 1
   ) wd ON wd.employeeid = e.id;

The last one is typically faster if you retrieve all or most of the rows in the base tables.
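The following sketch checks the "aggregate both in subqueries" shape against the sample data from the question. It runs in Python with the standard-library sqlite3 module. SQLite has no array_agg(), so group_concat() stands in for it here; that substitution, and the use of SQLite at all, are assumptions of this sketch rather than part of the original answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption of this sketch).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address     (employeeid INTEGER, street TEXT);
    CREATE TABLE workingdays (employeeid INTEGER, day TEXT);
    INSERT INTO employees VALUES (1, 'peter', 25);
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
    INSERT INTO workingdays VALUES
        (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday'), (1, 'Thursday'), (1, 'Friday');
""")

# Each "has many" table is collapsed to one row per employee before the join,
# so the joins can no longer multiply rows.
row = con.execute("""
    SELECT e.name, e.age, ad.streets, wd.days
    FROM   employees e
    JOIN  (
       SELECT employeeid, group_concat(street) AS streets  -- stand-in for array_agg(street)
       FROM   address
       GROUP  BY employeeid
       ) ad ON ad.employeeid = e.id
    JOIN  (
       SELECT employeeid, group_concat(day) AS days        -- stand-in for array_agg(day)
       FROM   workingdays
       GROUP  BY employeeid
       ) wd ON wd.employeeid = e.id
""").fetchone()

name, age, streets, days = row
# Sort after splitting: group_concat() does not guarantee element order.
street_list = sorted(streets.split(","))
day_list = sorted(days.split(","))
print(name, age, len(street_list), len(day_list))
```

Each street and each day now appears exactly once, because neither subquery ever sees rows from the other table.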

Note that using JOIN rather than LEFT JOIN removes employees from the result who have no address or no working days. That may or may not be intended. Switch to LEFT JOIN to retain all employees in the result.
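The difference between JOIN and LEFT JOIN can be sketched as follows, again in Python with the standard-library sqlite3 module as a stand-in for Postgres and group_concat() in place of array_agg(). The second employee, 'mary', has no addresses and no working days. She is a hypothetical row added only for this illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption of this sketch).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address     (employeeid INTEGER, street TEXT);
    CREATE TABLE workingdays (employeeid INTEGER, day TEXT);
    INSERT INTO employees VALUES (1, 'peter', 25), (2, 'mary', 31);  -- mary is hypothetical
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
    INSERT INTO workingdays VALUES (1, 'Monday'), (1, 'Tuesday');
""")

# Same query shape for both runs; only the join type changes.
QUERY = """
    SELECT e.name, ad.streets, wd.days
    FROM   employees e
    {join} (
       SELECT employeeid, group_concat(street) AS streets
       FROM   address
       GROUP  BY employeeid
       ) ad ON ad.employeeid = e.id
    {join} (
       SELECT employeeid, group_concat(day) AS days
       FROM   workingdays
       GROUP  BY employeeid
       ) wd ON wd.employeeid = e.id
    ORDER  BY e.name
"""

inner_rows = con.execute(QUERY.format(join="JOIN")).fetchall()
left_rows = con.execute(QUERY.format(join="LEFT JOIN")).fetchall()

inner_names = [r[0] for r in inner_rows]  # employees with rows in both tables
left_names = [r[0] for r in left_rows]    # all employees
print(inner_names, left_names)
```

With LEFT JOIN, the employee without related rows is kept and both aggregated columns are NULL for her (None in Python).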

For a small selection, I would consider correlated subqueries instead:

SELECT name, age
    , (SELECT array_agg(street) FROM address WHERE employeeid = e.id) AS streets
    , (SELECT array_agg(day) FROM workingdays WHERE employeeid = e.id) AS days
FROM   employees e
WHERE  e.name = 'peter';  -- very selective
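As a quick check of the correlated-subquery shape, here is a minimal sketch in Python with the standard-library sqlite3 module. As an assumption of this sketch, SQLite stands in for Postgres and group_concat() for array_agg(); the sample rows come from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Postgres (assumption of this sketch).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE employees   (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
    CREATE TABLE address     (employeeid INTEGER, street TEXT);
    CREATE TABLE workingdays (employeeid INTEGER, day TEXT);
    INSERT INTO employees VALUES (1, 'peter', 25);
    INSERT INTO address VALUES (1, '1st street'), (1, '2nd street');
    INSERT INTO workingdays VALUES
        (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday'), (1, 'Thursday'), (1, 'Friday');
""")

# Each scalar subquery aggregates one table for the current employee only,
# so the two "has many" tables are never joined to each other.
row = con.execute("""
    SELECT name, age
         , (SELECT group_concat(street) FROM address     WHERE employeeid = e.id) AS streets
         , (SELECT group_concat(day)    FROM workingdays WHERE employeeid = e.id) AS days
    FROM   employees e
    WHERE  e.name = ?
""", ("peter",)).fetchone()

name, age, streets, days = row
# Sort after splitting: group_concat() does not guarantee element order.
street_list = sorted(streets.split(","))
day_list = sorted(days.split(","))
print(name, age, len(street_list), len(day_list))
```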

Or, with Postgres 9.3 or later, you can use LATERAL joins for that:

SELECT e.name, e.age, a.streets, w.days
FROM   employees e
LEFT   JOIN LATERAL (
   SELECT array_agg(street) AS streets
   FROM   address
   WHERE  employeeid = e.id
   ) a ON true
LEFT   JOIN LATERAL (
   SELECT array_agg(day) AS days
   FROM   workingdays
   WHERE  employeeid = e.id
   ) w ON true
WHERE  e.name = 'peter';  -- very selective

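  • What is the difference between LATERAL and a subquery in PostgreSQL?

Either query (correlated subqueries or LATERAL joins) retains all employees in the result. For an employee with no address or no working days, the corresponding array is NULL.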
