Multiple array_agg() calls in a single query
Question
I'm trying to accomplish something with a query, but it's not really working. My application used to run on a MongoDB database, so it expects to get arrays in a field. We have now had to switch to Postgres, and I don't want to change the application code, so that v1 keeps working.
To get an array in a single field in Postgres, I used the array_agg() function, and this has worked fine so far. However, I'm now at a point where I need a second array in another field, built from a different table.
For example:
I have employees. Each employee has multiple addresses and multiple working days.
SELECT name, age, array_agg(ad.street) FROM employees e
JOIN address ad ON e.id = ad.employeeid
GROUP BY name, age
This worked fine for me. It results in, for example:
| name  | age | array_agg(ad.street)     |
| peter | 25  | {1st street, 2nd street} |
Now I want to join another table for working days, so I do:
SELECT name, age, array_agg(ad.street), array_agg(wd.day) FROM employees e
JOIN address ad ON e.id = ad.employeeid
JOIN workingdays wd ON e.id = wd.employeeid
GROUP BY name, age
This results in:
| peter | 25 | {1st street, 1st street, 1st street, 1st street, 1st street, 2nd street, 2nd street, 2nd street, 2nd street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday,Monday,Tuesday,Wednesday,Thursday,Friday} |
But I need the result to be:
| peter | 25 | {1st street, 2nd street} | {Monday,Tuesday,Wednesday,Thursday,Friday} |
I understand it has to do with my joins: because of the multiple joins, the rows multiply. But I don't know how to fix this. Can anyone point me in the right direction?
Recommended answer
DISTINCT is often applied to repair queries that are rotten from the inside, and that's often slow and/or incorrect. Don't multiply rows to begin with; then you don't have to sort out unwanted duplicates at the end.
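For illustration, the kind of repair meant here would look roughly like the sketch below. It assumes the tables and columns from the question (employees, address, workingdays). It does return de-duplicated arrays, but only after building every combination of address and working day. It would also silently collapse legitimate duplicates, and the elements typically come back sorted by value rather than in their original order.

```sql
-- Sketch only: this masks the row multiplication instead of avoiding it.
-- Table and column names are taken from the question.
SELECT e.id, e.name, e.age
     , array_agg(DISTINCT ad.street) AS streets  -- de-duplicated after the fact
     , array_agg(DISTINCT wd.day)    AS days     -- likewise; original order is lost
FROM   employees e
JOIN   address     ad ON ad.employeeid = e.id
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age;
```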
Joining to multiple n-tables ("has many") at once multiplies rows in the result set. That's like a CROSS JOIN or Cartesian product by proxy:
- Two SQL LEFT JOINS produce incorrect result
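The multiplication can be made visible by counting rows. The following is a minimal sketch that inlines the question's sample data as CTEs (the id value 1 is made up for the illustration), so it runs without creating any tables:

```sql
-- Sample data inlined as CTEs; the employee id is an assumed value.
WITH employees(id, name, age) AS (
   VALUES (1, 'peter', 25)
)
, address(employeeid, street) AS (
   VALUES (1, '1st street'), (1, '2nd street')
)
, workingdays(employeeid, day) AS (
   VALUES (1, 'Monday'), (1, 'Tuesday'), (1, 'Wednesday')
        , (1, 'Thursday'), (1, 'Friday')
)
SELECT e.name
     , (SELECT count(*) FROM address     WHERE employeeid = e.id) AS addresses
     , (SELECT count(*) FROM workingdays WHERE employeeid = e.id) AS days
     , count(*) AS joined_rows  -- rows produced by joining both tables at once
FROM   employees e
JOIN   address     ad ON ad.employeeid = e.id
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name;
```

For this data the query reports 2 addresses, 5 days, and 10 joined rows: every address is paired with every working day, which is why each array in the question contains ten elements.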
There are various ways to avoid this mistake.
Technically, the query works as long as you join to only one table with multiple rows per employee at a time before you aggregate:
SELECT e.id, e.name, e.age, e.streets, array_agg(wd.day) AS days
FROM  (
   SELECT e.id, e.name, e.age, array_agg(ad.street) AS streets
   FROM   employees e
   JOIN   address ad ON ad.employeeid = e.id
   GROUP  BY e.id    -- id is enough if it is the defined PK
   ) e
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age, e.streets;  -- all columns of the derived table must be listed
It's also best to include the primary key id and GROUP BY it, because name and age are not necessarily unique. You could merge two employees by mistake.
But you can aggregate in a subquery before you join. That's superior unless you have selective WHERE conditions on employees:
SELECT e.id, e.name, e.age, ad.streets, array_agg(wd.day) AS days
FROM   employees e
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN   workingdays wd ON wd.employeeid = e.id
GROUP  BY e.id, e.name, e.age, ad.streets;
Or aggregate both:
SELECT e.name, e.age, ad.streets, wd.days
FROM   employees e
JOIN  (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
JOIN  (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY 1
   ) wd ON wd.employeeid = e.id;
The last one is typically faster if you retrieve all or most of the rows.
Note that using JOIN and not LEFT JOIN removes employees from the result who have no address or no working days. That may or may not be intended. Switch to LEFT JOIN to retain all employees in the result.
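As a sketch of that switch, here is the variant that aggregates both tables in subqueries, rewritten with LEFT JOIN. It assumes the question's tables. The COALESCE calls are an optional addition, not part of the original answer: without them, an employee with no matching rows gets NULL rather than an empty array, which may matter if the application always expects an array.

```sql
SELECT e.name, e.age
     , COALESCE(ad.streets, '{}') AS streets  -- optional: empty array instead of NULL
     , COALESCE(wd.days,    '{}') AS days
FROM   employees e
LEFT   JOIN (
   SELECT employeeid, array_agg(street) AS streets
   FROM   address
   GROUP  BY 1
   ) ad ON ad.employeeid = e.id
LEFT   JOIN (
   SELECT employeeid, array_agg(day) AS days
   FROM   workingdays
   GROUP  BY 1
   ) wd ON wd.employeeid = e.id;
```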
For a small selection, I would consider correlated subqueries instead:
SELECT name, age
     , (SELECT array_agg(street) FROM address     WHERE employeeid = e.id) AS streets
     , (SELECT array_agg(day)    FROM workingdays WHERE employeeid = e.id) AS days
FROM   employees e
WHERE  e.name = 'peter';  -- very selective
Or, with Postgres 9.3 or later, you can use LATERAL joins for that:
SELECT e.name, e.age, a.streets, w.days
FROM   employees e
LEFT   JOIN LATERAL (
   SELECT array_agg(street) AS streets
   FROM   address
   WHERE  employeeid = e.id
   -- no GROUP BY: a plain aggregate returns exactly one row per employee
   ) a ON true
LEFT   JOIN LATERAL (
   SELECT array_agg(day) AS days
   FROM   workingdays
   WHERE  employeeid = e.id
   ) w ON true
WHERE  e.name = 'peter';  -- very selective
- What is the difference between LATERAL and a subquery in PostgreSQL?
Either query retains all employees in the result.