When are (SELECT) queries planned?


Problem description

In PostgreSQL, when are (SELECT) queries planned?

Is it:


  1. at statement prepare time, or

  2. at the start of processing the SELECT, or

  3. something else

The reason I ask is that there is a Stack Overflow question: "same query, two different ways, vastly different performance".

A lot of people seem to be thinking that the query is planned differently because in one case the query contains a string literal ('foo') and in another case it's a placeholder (?).

Now my thinking is that this is a red herring, because the query isn't planned at statement-prepare time, but is actually planned at SELECT time.

So, say, I could prepare a statement with a placeholder, then run the query multiple times with different bound values, and the query planner will be run for each different bound value.

I suspect that the question linked above boils down to the PostgreSQL data type of the value. In the case of a 'foo' literal it is known to be a string, but in the case of a placeholder the type can't be divined, so it comes through to the query planner as some strange type for which it can't create an efficient plan. In that case, the issue is not that the query is planned differently because the value is a placeholder (at statement preparation time) per se, but that the value comes through to the query as a different PostgreSQL type, and that is what influences the query planner. Fixing this would simply be a matter of binding the placeholder with an appropriate explicit type declaration.

Recommended answer

I cannot talk about the client-side Perl interface itself but I can shed some light on the PostgreSQL server side.

PostgreSQL has prepared statements and unprepared statements. Unprepared statements are parsed, planned and executed immediately. They also do not support parameter substitution. In a plain psql shell you can show their query plan like this:

tmpdb> explain select * from sometable where flag = true;

On the other hand, there are prepared statements. They are usually (see "exception" below) parsed and planned in one step and executed in a second step. They can be re-executed several times with different parameters, because they do support parameter substitution. The equivalent in psql is this:

tmpdb> prepare foo as select * from sometable where flag = $1;
tmpdb> explain execute foo(true);

You may see that the plan is different from the plan for the unprepared statement, because planning already took place in the prepare phase, as described in the doc for PREPARE:


When the PREPARE statement is executed, the specified statement is parsed, rewritten, and planned. When an EXECUTE command is subsequently issued, the prepared statement need only be executed. Thus, the parsing, rewriting, and planning stages are only performed once, instead of every time the statement is executed.

This also means that the plan is NOT optimized for the substituted parameters. The first example might use an index for flag because PostgreSQL knows that within a million entries only ten have the value true. This reasoning is impossible when PostgreSQL uses a prepared statement. In that case a plan is created which will work as well as possible for all possible parameter values. This might exclude the mentioned index, because fetching the better part of the complete table via random access (due to the index) is slower than a plain sequential scan. The PREPARE doc confirms this:


In some situations, the query plan produced for a prepared statement will be inferior to the query plan that would have been chosen if the statement had been submitted and executed normally. This is because when the statement is planned and the planner attempts to determine the optimal query plan, the actual values of any parameters specified in the statement are unavailable. PostgreSQL collects statistics on the distribution of data in the table, and can use constant values in a statement to make guesses about the likely result of executing the statement. Since this data is unavailable when planning prepared statements with parameters, the chosen plan might be suboptimal.
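To make that reasoning concrete, here is a minimal sketch in Python of a toy cost comparison. It is not PostgreSQL's real cost arithmetic: the two page-cost constants mirror the default values of seq_page_cost and random_page_cost, but the formula, the 100-rows-per-page figure, the 0.5 selectivity guess for an unknown boolean parameter, and the function name choose_plan are all assumptions made up for illustration.

```python
# Toy cost model, NOT PostgreSQL's actual planner arithmetic.
SEQ_PAGE_COST = 1.0     # cost of reading one page sequentially
RANDOM_PAGE_COST = 4.0  # cost of fetching one page at random via an index


def choose_plan(total_rows, rows_per_page, selectivity):
    """Return the cheaper access path for a predicate matching
    `selectivity` (a fraction between 0 and 1) of the rows."""
    pages = total_rows / rows_per_page
    seq_cost = pages * SEQ_PAGE_COST
    # Pessimistic assumption: each matching row needs one random page fetch.
    index_cost = total_rows * selectivity * RANDOM_PAGE_COST
    return "index scan" if index_cost < seq_cost else "seq scan"


TOTAL_ROWS = 1_000_000
ROWS_PER_PAGE = 100  # hypothetical figure

# Literal in the query: statistics say only ten rows have flag = true.
custom_true = choose_plan(TOTAL_ROWS, ROWS_PER_PAGE, 10 / TOTAL_ROWS)

# Parameter unknown at plan time: assume the boolean splits evenly.
generic = choose_plan(TOTAL_ROWS, ROWS_PER_PAGE, 0.5)

print(custom_true)
print(generic)
```

With the literal, the toy model prefers the index; without a known value it falls back to the sequential scan, which is the kind of difference the quoted documentation warns about.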

BTW: regarding plan caching, the PREPARE doc also has something to say:


Prepared statements only last for the duration of the current database session. When the session ends, the prepared statement is forgotten, so it must be recreated before being used again.

Also, there is no automatic plan ...

EXCEPTION: I have mentioned "usually". The psql examples shown are not what a client adapter like Perl DBI really uses. It uses a certain protocol. Here the term "simple query" corresponds to the "unprepared query" in psql, and the term "extended query" corresponds to the "prepared query", with one exception: there is a distinction between (one) "unnamed statement" and (possibly multiple) "named statements". Regarding named statements the doc says:


Named prepared statements can also be created and accessed at the SQL command level, using PREPARE and EXECUTE.

and:


Query planning for named prepared-statement objects occurs when the Parse message is processed.

So in this case planning is done without parameters as described above for PREPARE - nothing new.

The mentioned exception is the "unnamed statement". The doc says:


The unnamed prepared statement is likewise planned during Parse processing if the Parse message defines no parameters. But if there are parameters, query planning occurs every time Bind parameters are supplied. This allows the planner to make use of the actual values of the parameters provided by each Bind message, rather than use generic estimates.

And here is the benefit: Although the unnamed statement is "prepared" (i.e. can have parameter substitution), it also can adapt the query plan to the actual parameters.
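The difference between named and unnamed statements can be sketched with a small Python model that only counts planner runs. This is a toy following the doc passages quoted above, not a real protocol implementation and not tied to any particular server version; the class ToyServer and its methods are hypothetical names. The one protocol detail borrowed is that the unnamed statement is addressed by an empty name.

```python
class ToyServer:
    """Toy model of when planning happens; not a real protocol implementation."""

    def __init__(self):
        self.plan_count = 0   # how many times the pretend planner ran
        self.statements = {}  # statement name -> stored state

    def _plan(self, sql, params=None):
        self.plan_count += 1
        return (sql, params)  # stand-in for a real plan

    def parse(self, name, sql, n_params):
        # Unnamed statement (empty name) with parameters: defer planning to Bind.
        defer = name == "" and n_params > 0
        plan = None if defer else self._plan(sql)
        self.statements[name] = {"sql": sql, "plan": plan, "defer": defer}

    def bind(self, name, params):
        stmt = self.statements[name]
        if stmt["defer"]:
            # Plan again, this time with the actual parameter values.
            return self._plan(stmt["sql"], tuple(params))
        return stmt["plan"]  # reuse the plan made during Parse


SQL = "select * from sometable where flag = $1"

named = ToyServer()
named.parse("foo", SQL, n_params=1)
for value in (True, False, True):
    named.bind("foo", [value])

unnamed = ToyServer()
unnamed.parse("", SQL, n_params=1)
for value in (True, False, True):
    unnamed.bind("", [value])

print(named.plan_count, unnamed.plan_count)
```

In this model the named statement is planned once at Parse time, while the unnamed statement is planned once per Bind, so three executions cost three planner runs.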

BTW: The exact handling of the unnamed statement has changed several times in past releases of the PostgreSQL server. You can look up the old docs for details if you really want.

Rationale: Perl / any client

How a client like Perl uses the protocol is a completely different question. Some clients, like the JDBC driver for Java, basically say: even if the programmer uses a prepared statement, the first five (or so) executions are internally mapped to a "simple query" (i.e. effectively unprepared); after that the driver switches to a "named statement".
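The driver-side trick described for JDBC can be pictured as a per-statement execution counter. The following Python sketch is only a model of that description, not the JDBC driver's actual code; the class ToyDriver, the parameter prepare_threshold, and the threshold value of 5 (taken from "five (or so)") are assumptions for illustration.

```python
class ToyDriver:
    """Toy model of a driver that switches protocol mode after a threshold."""

    def __init__(self, prepare_threshold=5):
        self.prepare_threshold = prepare_threshold
        self.executions = {}  # SQL text -> number of executions so far

    def execute(self, sql, params):
        count = self.executions.get(sql, 0) + 1
        self.executions[sql] = count
        if count <= self.prepare_threshold:
            # Early executions: planned every time with the actual values.
            return "simple query"
        # Later executions: reuse one plan made without parameter values.
        return "named statement"


driver = ToyDriver()
modes = [
    driver.execute("select * from sometable where flag = ?", [True])
    for _ in range(7)
]
print(modes)
```

Under these assumptions the first five executions are replanned with real values and only later ones share a generic plan, which is why the same application code can show different plans over time.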

So a client has the following choices:


  • Force (re)planning each time by using the "simple query" protocol.
  • Plan once, execute multiple times by using the "extended query" protocol and the "named statement" (plan might be bad because planning is done without parameters).
  • Parse once, plan for each execution (with the current PostgreSQL version) by using the "extended query" protocol and the "unnamed statement", and by observing a few more requirements (provide some parameters in the "parse" message).
  • Play completely different tricks like the JDBC driver.

What Perl does currently: I don't know. But the mentioned "red herring" is not very unlikely.
