奇怪的查询计划 [英] Weird query plan

查看:56
本文介绍了奇怪的查询计划的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

大家好!


这是一个奇怪的问题,我遇到了......

我有两个巨大的(每个8000万行)表(a和b),两者都有id作为PK

,而且b引用的是一个FK。

当我尝试运行如下的查询时: br />

从a,b中选择*,其中a.id> = 7901288和a.id = b.id限制1;


查询需要*永远*。

如果我这样做

从a选择*,b其中b.id> = 7901288和a.id = b.id限制1;


然后立即返回。


查询计划在两种情况下看起来都相同:


限制(成本= 0.00..12.51行= 1宽度= 8)

- >嵌套循环(成本= 0.00..1009772807.91行= 80740598宽度= 8)

- >使用b上的b_pkey进行索引扫描(成本= 0.00..375410773.29

行= 80740598宽度= 4)

- >使用a_pkey进行索引扫描(成本= 0.00..6.85行= 1宽度= 4)


....这让我觉得它决定使用b作为外表

这两种情况(显然会让它在第一种情况下吮吸)...... :-(


这种情况​​发生在7.2.4 ...我有一个具有相同模式的7.3数据库,但

它没有填充数据,所以我无法在7.3上测试它...

我看了7.3但是我的查询计划对我来说看起来更好:


限制(成本= 0.00..4.97行= 1宽度= 8)

- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)

- >使用b上的b_pkey进行索引扫描(成本= 0.00..45.50行= 333

width = 4)

指数条件:(id> = 7901288)

- >使用a_pkey进行索引扫描(成本= 0.00..4.82行) = 1宽度= 4)

指数条件:(a.id =" outer" .id)


在第二种情况下,


限制(成本= 0.00..4.97行= 1宽度= 8)

- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)

- >使用a_pkey进行索引扫描(成本= 0.00..45.50行= 333

宽度= 4)

索引条件:(id> = 7901288)

- >使用b上的b_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)

索引条件:(外部.id = b.id)

在第一种情况下...(看起来它像我预期的那样交换它们)...


你知道在7.2之间修复的任何东西。 4和7.3,关于这个问题是什么?


我也注意到在from子句中将a,b更改为b,a不会

会影响任何事情...而且(甚至更奇怪)即使使用明确的

加入也无济于事:

解释从自然连接中选择a.duns b dm其中a.id> = 7901288

限制1;

注意:查询计划:


限制(成本= 0.00..12.78行= 1宽度= 8)

- >嵌套循环(成本= 0.00..1023061272.15行= 80049919宽度= 8)

- >使用b上的b_pkey进行索引扫描(成本= 0.00..380070641.01

行= 81786784宽度= 4)

- >使用a_pkey进行索引扫描(成本= 0.00..6.86行= 1宽度= 4)


:-(


任何想法?


非常感谢!


迪玛


--------- ------------------(广播结束)---------------------------

提示6:您是否搜索了我们的列表档案?

http://archives.postgresql.org

Hi, everybody!

Here is a weird problem, I ran into...
I have two huge (80 million rows each) tables (a and b), with id as a PK
on both of them and also an FK from b referencing a.
When I try to run a query like:

select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;

The query takes *forever*.
If I do
select * from a,b where b.id >= 7901288 and a.id=b.id limit 1;

then it returns right away.

The query plan looks identical in both cases:

Limit (cost=0.00..12.51 rows=1 width=8)
-> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
-> Index Scan using b_pkey on b (cost=0.00..375410773.29
rows=80740598 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4)

.... which makes me think that it decides to use b as the outer table
for both cases (which would obviously make it suck in the first one)... :-(

This happens on 7.2.4... I have a 7.3 database with the same schema, but
it is not populated with data, so I could not test it on 7.3...
I looked at the 7.3''s query plans though, and they look better to me:

Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333
width=4)
Index Cond: (id >= 7901288)
-> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)
Index Cond: (a.id = "outer".id)

in the second case, and

Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333
width=4)
Index Cond: (id >= 7901288)
-> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)
Index Cond: ("outer".id = b.id)

in the first case... (looks like it does swap them around as I expected)...

Do you know of anything that got fixed between 7.2.4 and 7.3, related to
this problem?

I also noticed that changing a,b to b,a in the from clause doesn''t
affect anything... and (what''s even more weird) even using an explicit
join doesn''t help:
explain select a.duns from a natural join b dm where a.id >= 7901288
limit 1;
NOTICE: QUERY PLAN:

Limit (cost=0.00..12.78 rows=1 width=8)
-> Nested Loop (cost=0.00..1023061272.15 rows=80049919 width=8)
-> Index Scan using b_pkey on b (cost=0.00..380070641.01
rows=81786784 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.86 rows=1 width=4)

:-(

Any ideas?

Thanks a lot!

Dima

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

推荐答案

Dmitry Tkach< dm **** @ openratings.com>写道:
Dmitry Tkach <dm****@openratings.com> writes:
查询计划在两种情况下看起来都相同:
限制(成本= 0.00..12.51行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1009772807.91行= 80740598宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..375410773.29
rows = 80740598 width = 4)
- >使用a_pkey进行索引扫描(成本= 0.00..6.85行= 1宽度= 4)
...这让我觉得它决定使用ba两个案件的外表(这显然会让它在第一个案件中吮吸)...... :-(


这就是它所说的,所有对,这对我来说似乎很奇怪。你确定你看到了正确的计划吗?

这种情况​​发生在7.2.4 ...我有一个具有相同架构的7.3数据库,但是
它没有填充数据,所以我无法在7.3上测试...
The query plan looks identical in both cases: Limit (cost=0.00..12.51 rows=1 width=8)
-> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
-> Index Scan using b_pkey on b (cost=0.00..375410773.29
rows=80740598 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4) ... which makes me think that it decides to use b as the outer table
for both cases (which would obviously make it suck in the first one)... :-(
That''s what it says, all right, which seems odd to me. Are you sure you
looked at the right plans?
This happens on 7.2.4... I have a 7.3 database with the same schema, but
it is not populated with data, so I could not test it on 7.3...




我无法在7.2.4上重现问题。我得到了(使用玩具桌,并且

压制了计划者使用mergejoin的冲动)


lo =#explain select * from a,b其中a.id> = 7901288和a.id = b.id限制1;

注意:查询计划:


限制(成本= 0.00。 .4.97行= 1宽度= 8)

- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)

- >使用a_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)

- >使用b上的b_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)


EXPLAIN

lo =#explain select * from a, b其中b.id> = 7901288和a.id = b.id限制1;

注意:查询计划:


限制(成本= 0.00) ..4.97行= 1宽度= 8)

- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)

- >使用b上的b_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)

- >使用a_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)

EXPLAIN


看起来像右边事情。


问候,汤姆小巷


-------------------- -------(广播结束)---------------------------

提示8:解释分析是你的朋友



I could not reproduce a problem on 7.2.4. I get (using toy tables, and
suppressing the planner''s urge to use mergejoin instead)

lo=# explain select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:

Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)

EXPLAIN
lo=# explain select * from a, b where b.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:

Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)

EXPLAIN

which looks like the right thing.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


嗯...

的确如此。我在7.2.4的几张空桌子上试过它,它确实做了

正确的事情......

另外,我还有另一份副本(不是精确的副本) ,但相同的架构,和原始数据库相似的内容......但是大约小两倍...

我尝试了我的查询,它工作正常也好。

所以,我想这个特定的数据库一定有问题...


我应该看看什么想法?


非常感谢!


迪玛


Tom Lane写道:
Hmmm...
Indeed. I tried it in 7.2.4 on a couple of empty tables, and it does do
the right thing...
Also, I have another copy (not exact copy, but identical schema, and
similar content... but about twice smaller) of the original database...
I tried my query on it, and it works right too.
So, there must be something wrong with that particular database I suppose...

Any ideas what I should look at?

Thanks a lot!

Dima

Tom Lane wrote:
Dmitry Tkach< dm **** @ openratings.com>写道:

Dmitry Tkach <dm****@openratings.com> writes:

查询计划在两种情况下看起来都相同:
The query plan looks identical in both cases:


限制(成本) = 0.00..12.51行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1009772807.91行= 80740598宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..375410773.29
rows = 80740598 width = 4)
- >使用a_pkey进行索引扫描(成本= 0.00..6.85行= 1宽度= 4)
Limit (cost=0.00..12.51 rows=1 width=8)
-> Nested Loop (cost=0.00..1009772807.91 rows=80740598 width=8)
-> Index Scan using b_pkey on b (cost=0.00..375410773.29
rows=80740598 width=4)
-> Index Scan using a_pkey on a (cost=0.00..6.85 rows=1 width=4)


...这让我觉得它决定了对于两种情况都使用b作为外表(这显然会使它在第一种情况下吮吸)...... :-(
... which makes me think that it decides to use b as the outer table
for both cases (which would obviously make it suck in the first one)... :-(



那'这就是它说的,好吧,这对我来说似乎很奇怪。你确定你看了正确的计划吗?



That''s what it says, all right, which seems odd to me. Are you sure you
looked at the right plans?

这发生在7.2.4 ......我有一个具有相同模式的7.3数据库,但是它没有填充数据,所以我无法在7.3上测试它...
This happens on 7.2.4... I have a 7.3 database with the same schema, but
it is not populated with data, so I could not test it on 7.3...



我无法在7.2.4上重现问题。我得到了(使用玩具桌,并且
抑制了计划者使用mergejoin的冲动)

lo =#explain select * from a ,b其中a.id> = 7901288和a.id = b.id限制1;
注意:查询计划:

限制(成本= 0.00..4.97行= 1宽度= 8)
- >嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用a_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)

解析
lo =#explain select * from a,b其中b.id> = 7901288和a.id = b.id限制1;
通知:查询计划:

限制(成本= 0.00..4.97行= 1宽度= 8)
- > ;嵌套循环(成本= 0.00..1657.34行= 333宽度= 8)
- >使用b上的b_pkey进行索引扫描(成本= 0.00..45.50行= 333宽度= 4)
- >使用a_pkey进行索引扫描(成本= 0.00..4.82行= 1宽度= 4)

EXPLAIN

这看起来是正确的。



I could not reproduce a problem on 7.2.4. I get (using toy tables, and
suppressing the planner''s urge to use mergejoin instead)

lo=# explain select * from a, b where a.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:

Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using a_pkey on a (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using b_pkey on b (cost=0.00..4.82 rows=1 width=4)

EXPLAIN
lo=# explain select * from a, b where b.id >= 7901288 and a.id=b.id limit 1;
NOTICE: QUERY PLAN:

Limit (cost=0.00..4.97 rows=1 width=8)
-> Nested Loop (cost=0.00..1657.34 rows=333 width=8)
-> Index Scan using b_pkey on b (cost=0.00..45.50 rows=333 width=4)
-> Index Scan using a_pkey on a (cost=0.00..4.82 rows=1 width=4)

EXPLAIN

which looks like the right thing.

regards, tom lane




------------------------ ---(广播结束)---------------------------

提示2:你可以下车使用取消注册命令一次性列出所有列表

(发送取消注册YourEmailAddressHere到 ma **** ***@postgresql.org



---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to ma*******@postgresql.org)


Dmitry Tkach< dm **** @ openratings.com>写道:
Dmitry Tkach <dm****@openratings.com> writes:
另外,我还有另一个副本(不是完全复制,但是相同的模式,和类似的内容......但大约小两倍)原始数据库...
我尝试了我的查询,它也正常。
所以,我想这个特定的数据库一定有问题......
Also, I have another copy (not exact copy, but identical schema, and
similar content... but about twice smaller) of the original database...
I tried my query on it, and it works right too. So, there must be something wrong with that particular database I suppose...




嗯。看看每个

案例中id列的pg_stats统计信息。那些行为不端的桌子可能会出现重击吗?b
不知怎的?我想知道,例如,计划者是否打折了> =

条件,因为它认为它会匹配所有行。


问候,汤姆车道


---------------------------(播出结束)------ ---------------------

提示1:订阅和取消订阅命令转到 ma ******* @ postgresql.org



Hmm. Take a look at the pg_stats statistics for the id columns in each
case. Could the ones for the misbehaving tables be out of whack
somehow? I''m wondering for example if the planner discounted the >=
condition because it thought it would match all the rows.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postgresql.org


这篇关于奇怪的查询计划的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆