What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
Question
Since Postgres came out with the ability to do LATERAL joins, I've been reading up on it, since I currently do complex data dumps for my team with lots of inefficient subqueries that make the overall query take four minutes or more.
I understand that LATERAL joins may be able to help me, but even after reading articles like this one from Heap Analytics, I still don't quite follow.
What is the use case for a LATERAL join? What is the difference between a LATERAL join and a subquery?
Recommended answer
What is a LATERAL join?
The feature was introduced with PostgreSQL 9.3.
Quoting the manual:
Subqueries appearing in FROM can be preceded by the key word LATERAL. This allows them to reference columns provided by preceding FROM items. (Without LATERAL, each subquery is evaluated independently and so cannot cross-reference any other FROM item.)
Table functions appearing in FROM can also be preceded by the key word LATERAL, but for functions the key word is optional; the function's arguments can contain references to columns provided by preceding FROM items in any case.
Basic code examples are given there.
A LATERAL join is more like a correlated subquery, not a plain subquery, in that expressions to the right of a LATERAL join are evaluated once for each row left of it - just like a correlated subquery - while a plain subquery (table expression) is evaluated once only. (The query planner has ways to optimize performance for either, though.)
Related answer with code examples for both side by side, solving the same problem:
For returning more than one column, a LATERAL join is typically simpler, cleaner and faster.
Also, remember that the equivalent of a correlated subquery is LEFT JOIN LATERAL ... ON true:
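As a minimal sketch of that equivalence (the customer and orders tables, their columns, and the sample rows are hypothetical and not part of the original answer), the following fetches the latest order per customer twice: first with correlated subqueries, then with LEFT JOIN LATERAL ... ON true. Both forms keep customers that have no orders.

```sql
-- Hypothetical demo tables, invented for illustration only.
CREATE TEMP TABLE customer (customer_id int PRIMARY KEY, name text);
CREATE TEMP TABLE orders (
   order_id    int PRIMARY KEY
 , customer_id int
 , ordered_at  date
 , amount      numeric
);

INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO orders   VALUES (10, 1, '2024-01-05', 50)
                          , (11, 1, '2024-02-01', 75);

-- Correlated subqueries: each one can return a single value only,
-- so two output columns need two separate subqueries.
SELECT c.name
     , (SELECT o.ordered_at
        FROM   orders o
        WHERE  o.customer_id = c.customer_id
        ORDER  BY o.ordered_at DESC
        LIMIT  1) AS last_order
     , (SELECT o.amount
        FROM   orders o
        WHERE  o.customer_id = c.customer_id
        ORDER  BY o.ordered_at DESC
        LIMIT  1) AS last_amount
FROM   customer c;

-- LEFT JOIN LATERAL ... ON true: one subquery returns both columns.
-- Bob has no orders and still appears, with NULL in both columns.
SELECT c.name, o.ordered_at AS last_order, o.amount AS last_amount
FROM   customer c
LEFT   JOIN LATERAL (
   SELECT o2.ordered_at, o2.amount
   FROM   orders o2
   WHERE  o2.customer_id = c.customer_id
   ORDER  BY o2.ordered_at DESC
   LIMIT  1
   ) o ON true;
```

Replacing LEFT JOIN LATERAL ... ON true with CROSS JOIN LATERAL would drop Bob from the result, which is why the LEFT variant is the one that matches a correlated subquery.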
There are things that a LATERAL join can do, but a (correlated) subquery cannot (easily). A correlated subquery can only return a single value, not multiple columns and not multiple rows - with the exception of bare function calls (which multiply result rows if they return multiple rows). But even certain set-returning functions are only allowed in the FROM clause. Like unnest() with multiple parameters in Postgres 9.4 or later. The manual:
This is only allowed in the FROM clause;
So this works, but cannot (easily) be replaced with a subquery:
CREATE TABLE tbl (a1 int[], a2 int[]);
SELECT * FROM tbl, unnest(a1, a2) u(elem1, elem2); -- implicit LATERAL
The comma (,) in the FROM clause is short notation for CROSS JOIN.
LATERAL is assumed automatically for table functions.
About the special case of UNNEST( array_expression [, ... ] ):
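To make the multi-argument form concrete, here is a small sketch with sample data. The table name tbl_demo, its id column, and the values are assumptions made up for illustration. The second query spells out the same operation with an explicit LATERAL and the ROWS FROM construct, which the manual describes as how multi-argument UNNEST combines its inputs.

```sql
-- Hypothetical demo table and data, not from the original answer.
CREATE TEMP TABLE tbl_demo (id int, a1 int[], a2 int[]);
INSERT INTO tbl_demo VALUES (1, '{1,2,3}', '{10,20}');

-- Multi-argument unnest() with implicit LATERAL.
-- The arrays are unnested in parallel; the shorter one is padded with NULL,
-- so this yields the pairs (1,10), (2,20), (3,NULL) for id 1.
SELECT t.id, u.elem1, u.elem2
FROM   tbl_demo t, unnest(t.a1, t.a2) AS u(elem1, elem2);

-- The same thing written out with explicit LATERAL and ROWS FROM.
SELECT t.id, u.elem1, u.elem2
FROM   tbl_demo t
CROSS  JOIN LATERAL ROWS FROM (unnest(t.a1), unnest(t.a2)) AS u(elem1, elem2);
```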
You can also use set-returning functions like unnest() in the SELECT list directly. This used to exhibit surprising behavior with more than one such function in the same SELECT list up to Postgres 9.6. But it has finally been sanitized with Postgres 10 and is a valid alternative now (even if not standard SQL). See:
Building on the above example:
SELECT *, unnest(a1) AS elem1, unnest(a2) AS elem2
FROM tbl;
Comparison:
dbfiddle for pg 9.6 here
dbfiddle for pg 10 here
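The fiddles themselves are not reproduced here. As a rough sketch of the version difference they are about (the table name srf_demo and its data are assumptions for illustration), arrays of unequal length make the change visible:

```sql
-- Hypothetical demo table and data, not from the original answer.
CREATE TEMP TABLE srf_demo (a1 int[], a2 int[]);
INSERT INTO srf_demo VALUES ('{1,2,3}', '{10,20}');

-- Two set-returning functions in the same SELECT list.
-- Postgres 10 or later: the functions advance in lockstep and the shorter
--   result is padded with NULL, giving 3 rows: (1,10), (2,20), (3,NULL).
-- Postgres 9.6 or older: the row count is the least common multiple of the
--   individual row counts, here 6 rows, with each array cycling repeatedly.
SELECT unnest(a1) AS elem1, unnest(a2) AS elem2
FROM   srf_demo;
```

With arrays of equal length, old and new versions return the same rows, which is why the old behavior often went unnoticed.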
For the INNER and OUTER join types, a join condition must be specified, namely exactly one of NATURAL, ON join_condition, or USING (join_column [, ...]). See below for the meaning.
For CROSS JOIN, none of these clauses can appear.
So these two queries are valid (even if not particularly useful):
SELECT *
FROM tbl t
LEFT JOIN LATERAL (SELECT * FROM b WHERE b.t_id = t.t_id) sub ON TRUE;

SELECT *
FROM tbl t, LATERAL (SELECT * FROM b WHERE b.t_id = t.t_id) sub;
While this one is not:
SELECT *
FROM tbl t
LEFT JOIN LATERAL (SELECT * FROM b WHERE b.t_id = t.t_id) sub; -- invalid: LEFT JOIN requires a join condition
That's why Andomar's code example is correct (the CROSS JOIN does not require a join condition) and Attila's was not.
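A small sketch of how the two valid join forms differ in practice. The parent and child tables and their rows are hypothetical stand-ins for tbl and b, chosen only for illustration.

```sql
-- Hypothetical demo tables and data, not from the original answer.
CREATE TEMP TABLE parent (t_id int PRIMARY KEY);
CREATE TEMP TABLE child  (b_id int PRIMARY KEY, t_id int);

INSERT INTO parent VALUES (1), (2);
INSERT INTO child  VALUES (100, 1), (101, 1);

-- CROSS JOIN LATERAL takes no join condition.
-- Parent rows without a match in the subquery are dropped:
-- 2 rows, both for t_id = 1.
SELECT p.t_id, c.b_id
FROM   parent p
CROSS  JOIN LATERAL (SELECT * FROM child WHERE child.t_id = p.t_id) c;

-- LEFT JOIN LATERAL needs a join condition; ON true is the usual dummy.
-- Parent rows without a match are kept:
-- 3 rows, including t_id = 2 with b_id NULL.
SELECT p.t_id, c.b_id
FROM   parent p
LEFT   JOIN LATERAL (SELECT * FROM child WHERE child.t_id = p.t_id) c ON true;
```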