奇怪的查询计划 [英] Weird query plan
问题描述
大家好!
这是一个奇怪的问题,我遇到了......
我有两个巨大的(每个8000万行)表(a和b),两者都有id作为PK
,而且b引用的是一个FK。
当我尝试运行如下的查询时: br />
从a,b中选择*,其中a.id> = 7901288和a.id = b.id限制1;
查询需要*永远*。
如果我这样做
从a选择*,b其中b.id> = 7901288和a.id = b.id限制1;
然后立即返回。
查询计划在两种情况下看起来都相同:
限制(成本= 0.00..12.51行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1009772807.91行= 80740598宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..375410773.29
行= 80740598宽度= 4)
- >使用a_pkey进行索引扫描(成本= 0.00..6.85行= 1宽度= 4)
....这让我觉得它决定使用b作为外表
这两种情况(显然会让它在第一种情况下吮吸)...... :-(
这种情况发生在7.2.4 ...我有一个具有相同模式的7.3数据库,但
它没有填充数据,所以我无法在7.3上测试它...
我看了7.3但是我的查询计划对我来说看起来更好:
限制(成本= 0.00..4.97行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..45.50行= 333
width = 4)
指数条件:(id> = 7901288)
- >使用a_pkey进行索引扫描(成本= 0.00..4.82行) = 1宽度= 4)
指数条件:(a.id =" outer" .id)
在第二种情况下,
限制(成本= 0.00..4.97行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用a_pkey进行索引扫描(成本= 0.00..45.50行= 333
宽度= 4)
索引条件:(id> = 7901288)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)
索引条件:(外部.id = b.id)
>
在第一种情况下...(看起来它像我预期的那样交换它们)...
你知道在7.2之间修复的任何东西。 4和7.3,关于这个问题是什么?
我也注意到在from子句中将a,b更改为b,a不会
会影响任何事情...而且(甚至更奇怪)即使使用明确的
加入也无济于事:
解释从自然连接中选择a.duns b dm其中a.id> = 7901288
限制1;
注意:查询计划:
限制(成本= 0.00..12.78行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1023061272.15行= 80049919宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..380070641.01
行= 81786784宽度= 4)
- >使用a_pkey进行索引扫描(成本= 0.00..6.86行= 1宽度= 4)
:-(
任何想法?
非常感谢!
迪玛
--------- ------------------(广播结束)---------------------------
提示6:您是否搜索了我们的列表档案?
http://archives.postgresql.org
Hi, everybody!
Here is a weird problem, I ran into...
I have two huge (80 million rows each) tables (a and b), with id as a PK
on both of them and also an FK from b referencing a.
When I try to run a query like:
select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;
The query takes *forever*.
If I do
select * from a,b where b.id >= 7901288 and a.id=b.id limit 1;
then it returns right away.
The query plan looks identical in both cases:
Limit (cost=0.00..12.51 rows=1 width=8)
-> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
-> Index Scan using b_pkey on b (cost=0.00..375410773.29
rows=80740598 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4)
.... which makes me think that it decides to use b as the outer table
for both cases (which would obviously make it suck in the first one)... :-(
This happens on 7.2.4... I have a 7.3 database with the same schema, but
it is not populated with data, so I could not test it on 7.3...
I looked at the 7.3''s query plans though, and they look better to me:
Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333
width=4)
Index Cond: (id >= 7901288)
-> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)
Index Cond: (a.id = "outer".id)
in the second case, and
Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333
width=4)
Index Cond: (id >= 7901288)
-> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)
Index Cond: ("outer".id = b.id)
in the first case... (looks like it does swap them around as I expected)...
Do you know of anything that got fixed between 7.2.4 and 7.3, related to
this problem?
I also noticed that changing a,b to b,a in the from clause doesn''t
affect anything... and (what''s even more weird) even using an explicit
join doesn''t help:
explain select a.duns from a natural join b dm where a.id >= 7901288
limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..12.78 rows=1 width=8)
-> Nested Loop (cost=0.00..1023061272.15 rows=80049919 width=8)
-> Index Scan using b_pkey on b (cost=0.00..380070641.01
rows=81786784 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.86 rows=1 width=4)
:-(
Any ideas?
Thanks a lot!
Dima
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
推荐答案
Dmitry Tkach< dm **** @ openratings.com>写道:
Dmitry Tkach <dm****@openratings.com> writes:
查询计划在两种情况下看起来都相同:
限制(成本= 0.00..12.51行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1009772807.91行= 80740598宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..375410773.29
rows = 80740598 width = 4)
- >使用a_pkey进行索引扫描(成本= 0.00..6.85行= 1宽度= 4)
...这让我觉得它决定使用ba两个案件的外表(这显然会让它在第一个案件中吮吸)...... :-(
这就是它所说的,所有对,这对我来说似乎很奇怪。你确定你看到了正确的计划吗?
这种情况发生在7.2.4 ...我有一个具有相同架构的7.3数据库,但是
它没有填充数据,所以我无法在7.3上测试...
The query plan looks identical in both cases: Limit (cost=0.00..12.51 rows=1 width=8)
-> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
-> Index Scan using b_pkey on b (cost=0.00..375410773.29
rows=80740598 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4) ... which makes me think that it decides to use b as the outer table
for both cases (which would obviously make it suck in the first one)... :-(
That''s what it says, all right, which seems odd to me. Are you sure you
looked at the right plans?
This happens on 7.2.4... I have a 7.3 database with the same schema, but
it is not populated with data, so I could not test it on 7.3...
我无法在7.2.4上重现问题。我得到了(使用玩具桌,并且
压制了计划者使用mergejoin的冲动)
lo =#explain select * from a,b其中a.id> = 7901288和a.id = b.id限制1;
注意:查询计划:
限制(成本= 0.00。 .4.97行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用a_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)
EXPLAIN
lo =#explain select * from a, b其中b.id> = 7901288和a.id = b.id限制1;
注意:查询计划:
限制(成本= 0.00) ..4.97行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)
- >使用a_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)
EXPLAIN
看起来像右边事情。
问候,汤姆小巷
-------------------- -------(广播结束)---------------------------
提示8:解释分析是你的朋友
I could not reproduce a problem on 7.2.4. I get (using toy tables, and
suppressing the planner''s urge to use mergejoin instead)
lo=# explain select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)
EXPLAIN
lo=# explain select * from a, b where b.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)
EXPLAIN
which looks like the right thing.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
嗯...
的确如此。我在7.2.4的几张空桌子上试过它,它确实做了
正确的事情......
另外,我还有另一份副本(不是精确的副本) ,但相同的架构,和原始数据库相似的内容......但是大约小两倍...
我尝试了我的查询,它工作正常也好。
所以,我想这个特定的数据库一定有问题...
我应该看看什么想法?
非常感谢!
迪玛
Tom Lane写道:
Hmmm...
Indeed. I tried it in 7.2.4 on a couple of empty tables, and it does do
the right thing...
Also, I have another copy (not exact copy, but identical schema, and
similar content... but about twice smaller) of the original database...
I tried my query on it, and it works right too.
So, there must be something wrong with that particular database I suppose...
Any ideas what I should look at?
Thanks a lot!
Dima
Tom Lane wrote:
Dmitry Tkach< dm **** @ openratings.com>写道:
Dmitry Tkach <dm****@openratings.com> writes:
查询计划在两种情况下看起来都相同:
The query plan looks identical in both cases:
限制(成本) = 0.00..12.51行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1009772807.91行= 80740598宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..375410773.29
rows = 80740598 width = 4)
- >使用a_pkey进行索引扫描(成本= 0.00..6.85行= 1宽度= 4)
Limit (cost=0.00..12.51 rows=1 width=8)
-> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
-> Index Scan using b_pkey on b (cost=0.00..375410773.29
rows=80740598 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4)
...这让我觉得它决定了对于两种情况都使用b作为外表(这显然会使它在第一种情况下吮吸)...... :-(
... which makes me think that it decides to use b as the outer table
for both cases (which would obviously make it suck in the first one)... :-(
那'这就是它说的,好吧,这对我来说似乎很奇怪。你确定你看了正确的计划吗?
That''s what it says, all right, which seems odd to me. Are you sure you
looked at the right plans?
这发生在7.2.4 ......我有一个具有相同模式的7.3数据库,但是它没有填充数据,所以我无法在7.3上测试它...
This happens on 7.2.4... I have a 7.3 database with the same schema, but
it is not populated with data, so I could not test it on 7.3...
我无法在7.2.4上重现问题。我得到了(使用玩具桌,并且
抑制了计划者使用mergejoin的冲动)
lo =#explain select * from a ,b其中a.id> = 7901288和a.id = b.id限制1;
注意:查询计划:
限制(成本= 0.00..4.97行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用a_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)
解析
lo =#explain select * from a,b其中b.id> = 7901288和a.id = b.id限制1;
通知:查询计划:
限制(成本= 0.00..4.97行= 1宽度= 8)
- > ;嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)
- >使用a_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)
EXPLAIN
这看起来是正确的。
I could not reproduce a problem on 7.2.4. I get (using toy tables, and
suppressing the planner''s urge to use mergejoin instead)
lo=# explain select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)
EXPLAIN
lo=# explain select * from a, b where b.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:
Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)
EXPLAIN
which looks like the right thing.
regards, tom lane
------------------------ ---(广播结束)---------------------------
提示2:你可以下车使用取消注册命令一次性列出所有列表
(发送取消注册YourEmailAddressHere到 ma **** ***@postgresql.org )
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)
Dmitry Tkach< dm **** @ openratings.com>写道:
Dmitry Tkach <dm****@openratings.com> writes:
另外,我还有另一个副本(不是完全复制,但是相同的模式,和类似的内容......但大约小两倍)原始数据库...
我尝试了我的查询,它也正常。
所以,我想这个特定的数据库一定有问题......
Also, I have another copy (not exact copy, but identical schema, and
similar content... but about twice smaller) of the original database...
I tried my query on it, and it works right too. So, there must be something wrong with that particular database I suppose...
嗯。看看每个
案例中id列的pg_stats统计信息。那些行为不端的桌子可能会出现重击吗?b
不知怎的?我想知道,例如,计划者是否打折了> =
条件,因为它认为它会匹配所有行。
问候,汤姆车道
---------------------------(播出结束)------ ---------------------
提示1:订阅和取消订阅命令转到 ma ******* @ postgresql.org
Hmm. Take a look at the pg_stats statistics for the id columns in each
case. Could the ones for the misbehaving tables be out of whack
somehow? I''m wondering for example if the planner discounted the >=
condition because it thought it would match all the rows.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org
这篇关于奇怪的查询计划的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!