PostgreSQL join_collapse_limit and time for query planning


Question

I just discovered that join_collapse_limit has been preventing the PostgreSQL planner from finding a much better join order. In my case, increasing the limit to 10 (from the default of 8) let the planner find a join order that cut the search time from ~30 seconds to ~1 ms, which is much more acceptable.

The documentation suggests that setting this value "too high" could result in long planning times, but it does not provide even a rule of thumb for how long the planning step might take at various values. I understand that the general problem is exponential in time, but I cannot find a way to determine the actual planning time, unless it is simply the time it takes to run EXPLAIN ANALYZE SELECT .... If that is the case, I believe the default of 8 is quite low for modern computers, since I can detect no difference in planning speed between 8 and 10.

Questions:

1) How can one measure planning time?

2) Approximately how high can join_collapse_limit be set while still expecting planning to take less than a couple hundred milliseconds?

Recommended answer

1) How can one measure planning time?

The upcoming PostgreSQL 9.4 release (not yet released at the time of this writing) is going to add planning time to the output of EXPLAIN and EXPLAIN ANALYZE, so you will be able to use those.

For older versions, your assumption is right: the best way to determine planning time is to run a plain EXPLAIN (without ANALYZE) and check how long it takes. In psql you can do this by enabling \timing (I generally do that in ~/.psqlrc).
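The EXPLAIN-plus-\timing approach can be sketched as a short psql session. This is only an illustration: the tables orders and customers are hypothetical stand-ins for your own query, and the reported time also includes parsing and the client/server round trip, so treat it as an upper bound on planning time rather than an exact figure.

```sql
-- Run inside psql. "orders" and "customers" are hypothetical tables.
\timing on

-- Plain EXPLAIN plans the query but does not execute it, so the
-- "Time: ... ms" line psql prints afterwards is dominated by planning.
EXPLAIN
SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id;
```

Running the statement a few times and ignoring the first result helps separate planning cost from one-off catalog cache warm-up.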

2) Approximately how high can join_collapse_limit be set while still expecting planning to take less than a couple hundred milliseconds?

The PostgreSQL hackers have already discussed raising it to a larger value, but it seems they could not guarantee that doing so would be good for all cases.

The problem is that finding the best join order for N tables takes an O(N!) (factorial) approach, so the number of candidate orders grows very quickly. You can see this with the following query:

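-- factorial(i) works on all versions; the postfix (i)! operator was removed in PostgreSQL 14
SELECT i, factorial(i) AS num_comparisons FROM generate_series(8, 20) i;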
 i  |   num_comparisons   
----+---------------------
  8 |               40320
  9 |              362880
 10 |             3628800
 11 |            39916800
 12 |           479001600
 13 |          6227020800
 14 |         87178291200
 15 |       1307674368000
 16 |      20922789888000
 17 |     355687428096000
 18 |    6402373705728000
 19 |  121645100408832000
 20 | 2432902008176640000
(13 rows)

As you can see, at the default of 8 there are at most about 40K comparisons. The value of 10 you proposed raises that to about 3.6M, which is still not very much for modern computers. Beyond that the numbers become too large, because the factorial simply grows too fast; 20 is just insane (21! does not even fit in a 64-bit integer).
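The 64-bit remark can be checked directly in SQL. A minimal sketch, relying on the fact that factorial() returns numeric (so the value itself does not overflow) and that 9223372036854775807 is the maximum of PostgreSQL's 64-bit bigint type; the column aliases are just illustrative names.

```sql
-- Compare 20! and 21! against the largest bigint value.
SELECT factorial(20) <= 9223372036854775807 AS fits_20,
       factorial(21) <= 9223372036854775807 AS fits_21;
```

The first column comes back true and the second false, which is why the table above stops at 20.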

Of course, sometimes you can set it to a larger value such as 16, which could (in theory) require up to about 20 trillion comparisons, and still get very good planning times. That is because PostgreSQL prunes some paths while planning and does not always need to check every order. But assuming that this will always be the case, and making such a high value the default, does not look like a good approach to me. Some unexpected query in the future may force it to check all the orders, and then a single query can bring your server down.

In my experience, I use 10 as the default on any installation with a good server, and on some of them I even use 12. I recommend setting it to 10 and, if you like, occasionally trying a higher value (I wouldn't go beyond 12) while monitoring closely to see how it behaves.
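One low-risk way to experiment is to change the setting for a single session before touching postgresql.conf. The sketch below assumes a psql session and reuses hypothetical orders and customers tables; substitute the query whose plan you are investigating.

```sql
-- Show the value currently in effect (8 unless it has been changed).
SHOW join_collapse_limit;

-- Raise the limit for this session only; other connections are unaffected.
SET join_collapse_limit = 10;

-- With \timing on, compare the plan and the elapsed time against
-- what the same statement produced at the previous setting.
EXPLAIN
SELECT *
FROM orders o
JOIN customers c ON c.id = o.customer_id;

-- Return to the server's configured value.
RESET join_collapse_limit;
```

Because SET only lasts for the session, a value that turns out to plan too slowly disappears as soon as you reconnect or run RESET.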
