PostgreSQL - Query running a lot faster with enable_nestloop=false. Why is the planner not doing the right thing?


Question

I have a query that runs a lot slower (~5 minutes) with the default enable_nestloop=true than it does with enable_nestloop=false (~10 secs).

EXPLAIN ANALYZE results for both cases:

Machine A, nestloop=true: http://explain.depesz.com/s/nkj0 (~5 minutes)
Machine A, nestloop=false: http://explain.depesz.com/s/wBM (~10 secs)

On a different, slightly slower machine, after copying the database over and leaving the default enable_nestloop=true, the query takes ~20 secs.

Machine B, nestloop=true: (~20 secs)

In all of the cases above I made sure to run ANALYZE before running the queries. No other queries were running in parallel.
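For reference, the comparison described above can be reproduced in a single session along the following lines. This is only a sketch: the `customers` and `orders` tables and the small aggregate query are hypothetical stand-ins, since the actual reporting query is not reproduced here. `SET LOCAL` is used so that the planner setting reverts at the end of the transaction.

```sql
-- Hypothetical stand-in tables (assumption; not the real schema).
CREATE TEMP TABLE customers (id integer PRIMARY KEY, name text);
CREATE TEMP TABLE orders (id integer PRIMARY KEY, customer_id integer NOT NULL);

INSERT INTO customers
SELECT g, 'customer ' || g::text FROM generate_series(1, 1000) AS g;
INSERT INTO orders
SELECT g, (g % 1000) + 1 FROM generate_series(1, 100000) AS g;

-- Refresh planner statistics before timing anything.
ANALYZE customers;
ANALYZE orders;

-- Run 1: default planner settings (enable_nestloop = on).
EXPLAIN ANALYZE
SELECT c.name, count(*)
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY c.name;

-- Run 2: nested loops discouraged, for this transaction only.
BEGIN;
SET LOCAL enable_nestloop = off;
EXPLAIN ANALYZE
SELECT c.name, count(*)
FROM orders o
JOIN customers c ON c.id = o.customer_id
GROUP BY c.name;
COMMIT;
```

Note that enable_nestloop=off does not forbid nested loops outright; it makes the planner treat them as very expensive, so they are still used when no other join method is possible.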

Both machines are running Postgres 8.4. Machine A is running 32-bit Ubuntu 10.04, while Machine B is running 32-bit Ubuntu 8.04.

The actual query is available here. It is a reporting query with many joins, as the database is mainly used for transaction processing.


  1. Without resorting to something like materialized views, what can I do to make the planner do what I achieved by setting enable_nestloop=false?
  2. From the research I have done, the reason the planner is choosing the seemingly suboptimal plan appears to be the huge difference between the estimated and actual row counts. How can I get these figures closer?
  3. If I should rewrite the query, what should I change?
  4. Why does the planner seem to be doing the right thing on Machine B? What should I be comparing between the two machines?
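As a possible starting point for the questions about row estimates and machine differences, the sketch below raises the statistics target on a single column and lists the settings that differ from their defaults. The `orders` table and `customer_id` column are hypothetical; in practice the column to target would be one whose estimated row count in the EXPLAIN ANALYZE output is far from the actual count. Whether a larger statistics sample improves the estimates for this particular query is an assumption, not something the plans above confirm.

```sql
-- Hypothetical table standing in for one with a badly estimated column.
CREATE TEMP TABLE orders (id integer PRIMARY KEY, customer_id integer NOT NULL);

-- Collect a larger statistics sample for that column, then re-analyze.
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;

-- Run on both machines and compare: settings changed from their defaults.
SELECT name, setting, source
FROM pg_settings
WHERE source <> 'default'
ORDER BY name;

-- Settings that commonly influence plans for queries with many joins.
SHOW default_statistics_target;
SHOW join_collapse_limit;
SHOW from_collapse_limit;
SHOW geqo_threshold;
SHOW work_mem;
SHOW effective_cache_size;
SHOW random_page_cost;

-- Exact server build, in case the two 8.4 installs are different minor releases.
SELECT version();
```

With the default join_collapse_limit and from_collapse_limit of 8, a query with more joins than that is not planned by exhaustively considering every join order, so differences in these settings between machines are worth ruling out.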


Recommended answer

It turns out that rewriting the query was the best fix. The query had many joins and relied heavily on left joins. I flattened it out and reduced the number of left joins, using what I know about how the data in the joined tables relates. I guess the rule of thumb is that if the planner is coming up with really bad estimates, there might be a better way of writing the query.
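The kind of rewrite described here can be illustrated with a minimal sketch. The schema is hypothetical and is not the actual query: it assumes every row in `orders` has a matching row in `customers`, enforced by a NOT NULL foreign key. Under that assumption a LEFT JOIN returns the same rows as an inner join, so it can be replaced.

```sql
-- Hypothetical schema: each order always references an existing customer.
CREATE TEMP TABLE customers (
    id   integer PRIMARY KEY,
    name text NOT NULL
);
CREATE TEMP TABLE orders (
    id          integer PRIMARY KEY,
    customer_id integer NOT NULL REFERENCES customers (id)
);

-- Before: written defensively as an outer join.
SELECT o.id, c.name
FROM orders o
LEFT JOIN customers c ON c.id = o.customer_id;

-- After: same result under the assumption above, expressed as an inner join.
SELECT o.id, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id;
```

The reason this can matter is that outer joins restrict the join orders the planner is allowed to consider, while inner joins can be reordered freely. The rewrite is only safe where the data guarantees a match; if some rows have no match, the inner join will silently drop them.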
