在 PostgreSQL 中,散列子计划是什么意思? [英] In PostgreSQL what does hashed subplan mean?

查看:69
本文介绍了在 PostgreSQL 中,散列子计划是什么意思?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想知道优化器如何重写查询以及如何读取PostgreSQL中的执行计划这是示例代码.

I want to know how the optimizer rewrote the query and how to read the execution plan in PostgreSQL Here is the sample code.

DROP TABLE ords;
CREATE TABLE ords (
ORD_ID INT NOT NULL,
ORD_PROD_ID VARCHAR(2) NOT NULL,
ETC_CONTENT VARCHAR(100));
ALTER TABLE ords ADD CONSTRAINT ords_PK PRIMARY KEY(ORD_ID);
CREATE INDEX ords_X01 ON ords(ORD_PROD_ID);
INSERT INTO ords
SELECT i
      ,chr(64+case when i <= 10 then i else 26 end)
      ,rpad('x',100,'x')
  FROM generate_series(1,10000) a(i);

SELECT COUNT(*) FROM ords WHERE ORD_PROD_ID IN ('A','B','C');

DROP TABLE delivery;
CREATE TABLE delivery (
ORD_ID INT NOT NULL,
VEHICLE_ID VARCHAR(2) NOT NULL,
ETC_REMARKS VARCHAR(100));
ALTER TABLE delivery ADD CONSTRAINT delivery_PK primary key (ORD_ID, VEHICLE_ID);
CREATE INDEX delivery_X01 ON delivery(VEHICLE_ID);
INSERT INTO delivery
SELECT i
     , chr(88 + case when i <= 10 then mod(i,2) else 2 end)
     , rpad('x',100,'x')
  FROM generate_series(1,10000) a(i);

analyze ords;
analyze delivery;

这是我感兴趣的 SQL.

This is the SQL I am interested in.

SELECT *
  FROM ords a
 WHERE ( EXISTS (SELECT 1
                   FROM delivery b
                  WHERE a.ORD_ID = b.ORD_ID
                    AND b.VEHICLE_ID IN ('X','Y')
                 )
         OR a.ORD_PROD_ID IN ('A','B','C')
         );
Here is the execution plan
| Seq Scan on portal.ords a (actual time=0.038..2.027 rows=10 loops=1)                                           |
|   Output: a.ord_id, a.ord_prod_id, a.etc_content                                                               |
|   Filter: ((alternatives: SubPlan 1 or hashed SubPlan 2) OR ((a.ord_prod_id)::text = ANY ('{A,B,C}'::text[]))) |
|   Rows Removed by Filter: 9990                                                                                 |
|   Buffers: shared hit=181                                                                                      |
|   SubPlan 1                                                                                                    |
|     ->  Index Only Scan using delivery_pk on portal.delivery b (never executed)                                |
|           Index Cond: (b.ord_id = a.ord_id)                                                                    |
|           Filter: ((b.vehicle_id)::text = ANY ('{X,Y}'::text[]))                                               |
|           Heap Fetches: 0                                                                                      |
|   SubPlan 2                                                                                                    |
|     ->  Index Scan using delivery_x01 on portal.delivery b_1 (actual time=0.023..0.025 rows=10 loops=1)        |
|           Output: b_1.ord_id                                                                                   |
|           Index Cond: ((b_1.vehicle_id)::text = ANY ('{X,Y}'::text[]))                                         |
|           Buffers: shared hit=8                                                                                |
| Planning:                                                                                                      |
|   Buffers: shared hit=78                                                                                       |
| Planning Time: 0.302 ms                                                                                        |
| Execution Time: 2.121 ms  

我不知道优化器如何转换 SQL.优化器重写的最终 SQL 是什么?我上面的SQL中只有一个EXISTS子查询,为什么有两个子计划?什么是散列子计划 2"?吝啬的?如果有人与我分享一点知识,我将不胜感激.

I don't know how the optimizer transformed the SQL. What is the final SQL the optimizer rewrote? I have only one EXISTS sub-query in the SQL above, why are there two sub-plans? What does "hashed Sub-Plan 2" mean? I would appreciate it if anyone share a little knowledge with me.

推荐答案

您有一种误解,认为优化器会重写 SQL 语句.事实并非如此.重写查询是查询重写器的工作,例如用它们的定义替换视图.优化器提出了一系列执行步骤来计算结果.它生成一个计划,而不是一个 SQL 语句.

You have the misconception that the optimizer rewrites the SQL statement. That is not the case. Rewriting the query is the job of the query rewriter, which for example replaces views with their definition. The optimizer comes up with a sequence of execution steps to compute the result. It produces a plan, not an SQL statement.

优化器计划两种选择:要么为找到的每一行执行子计划 1,要么执行子计划 2 一次(注意它独立于 a),根据结果构建一个哈希表并探测在 a 中找到的每一行的哈希值.

The optimizer plans two alternatives: either execute subplan 1 for each row found, or execute subplan 2 once (note that it is independent of a), build a hash table from the result and probe that hash for each row found in a.

在执行时,PostgreSQL 决定使用后一种策略,这就是子计划 1 永远不会执行的原因.

At execution time, PostgreSQL decides to use the latter strategy, that is why subplan 1 is never executed.

这篇关于在 PostgreSQL 中,散列子计划是什么意思?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆