Query performance - Please help


Problem description


I execute a query (against DB2 for iSeries) which, in its generic form,
is as follows. This query runs just fine, executing in a couple of
seconds:

SELECT V.FIELD01, V.FIELD02, V.FIELD03, V.FIELD04, V.FIELD05,
       V.FIELD06, V.FIELD07, V.FIELD08, V.FIELD09, V.FIELD10,
       V.FIELD11, V.FIELD12, V.FIELD13
FROM SCHEMA1.VIEW1 V
WHERE V.FIELD2 BETWEEN '03/10/2005' AND '03/10/2006'
  AND (V.FIELD4 = '103' OR V.FIELD4 = '100')
  AND (V.FIELD8 = 120 OR V.FIELD15 = 120)
  AND V.FIELD16 <> '4'
ORDER BY V.FIELD05, V.FIELD12

Now, I modify the query, because some business reason dictates that I
do so. I perform a join of the original view with some table on a
particular (single) column and also add an OR clause in the WHERE
clause. The modified query looks as follows and (here is my problem!)
takes circa 40 seconds to run!

SELECT V.FIELD01, V.FIELD02, V.FIELD03, V.FIELD04, V.FIELD05,
       V.FIELD06, V.FIELD07, V.FIELD08, V.FIELD09, V.FIELD10,
       V.FIELD11, V.FIELD12, V.FIELD13
FROM SCHEMA1.VIEW1 V, SCHEMA2.TABLE1 T          -- (1) Table added
WHERE T.FIELD1 = 1 AND V.FIELD14 = T.FIELD2      -- (2) Join condition
  AND V.FIELD2 BETWEEN '03/10/2005' AND '03/10/2006'
  AND (V.FIELD4 = '103' OR V.FIELD4 = '100')
  AND (V.FIELD8 = 120 OR V.FIELD15 = 120 OR T.FIELD3 = 120)
                                                 -- (3) OR condition on T.FIELD3 added in the line above
  AND V.FIELD16 <> '4'
ORDER BY V.FIELD05, V.FIELD12

I wonder why this join, at first sight innocuous, causes such a
deterioration in performance.
I've run both queries through the Visual Explain tool that comes with
iSeries Navigator. The bottleneck in the second case is the hash
join performed between the results of scanning the component tables of
view V and those of table T. The "Estimated Processing Time" for that
hash join is in the range of 40.

The graph in Visual Explain can be described as follows:
(1) "Index Scan - Key Positioning" is applied to two of the component
tables (say T1 and T2) of view V. A "Nested Loop Join" is applied to
the results.
(2) "Table Scan" is applied to the remaining component table (say T3)
of V. A "Temporary Hash Table" is produced.
[Note that this processing of V is the same as the one performed in the
original, simpler query.]
(3) "Table Scan, Parallel" is applied to table T, which has now come
into the picture. A "Temporary Hash Table" is produced.

Then a grand hash join is produced from the results of (1), (2), (3)
that takes 41 seconds! How could I avoid this?

Many, many thanks in advance!!!

Panagiotis Varlagas
va******@yahoo.com

Solution

I am by no means an expert. I will offer two comments, however.

1) (V.FIELD4 = '103' OR V.FIELD4 = '100') should probably be
V.FIELD4 IN ('103', '100')

2) I did not see the second TABLE being output. Perhaps an EXISTS
clause would be more appropriate than a JOIN.

B.
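
The two suggestions above can be sketched as follows. This is only an illustration under stated assumptions: SQLite (through Python's sqlite3 module) stands in for DB2 for iSeries, the tables hold made-up rows, and only the columns that matter for the rewrite are kept. It checks that the IN/EXISTS form selects the same rows as the join form; it says nothing about how either form performs on the iSeries.

```python
import sqlite3

# Toy stand-ins for SCHEMA1.VIEW1 and SCHEMA2.TABLE1 (hypothetical data;
# SQLite is used only so the sketch runs anywhere, it is not DB2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VIEW1  (FIELD01 INTEGER, FIELD4 TEXT, FIELD8 INTEGER,
                         FIELD14 INTEGER, FIELD15 INTEGER);
    CREATE TABLE TABLE1 (FIELD1 INTEGER, FIELD2 INTEGER, FIELD3 INTEGER);
    INSERT INTO VIEW1 VALUES
        (1, '103', 120, 10,   0),
        (2, '100',   0, 20,   0),
        (3, '100',   0, 30,   0),
        (4, '999', 120, 10,   0),
        (5, '103',   0, 40, 120);
    INSERT INTO TABLE1 VALUES
        (1, 10, 0), (1, 20, 120), (1, 30, 0), (2, 40, 120);
""")

# Join form, as in the modified query of the question (reduced columns).
JOIN_SQL = """
    SELECT V.FIELD01
    FROM VIEW1 V, TABLE1 T
    WHERE T.FIELD1 = 1 AND V.FIELD14 = T.FIELD2
      AND (V.FIELD4 = '103' OR V.FIELD4 = '100')
      AND (V.FIELD8 = 120 OR V.FIELD15 = 120 OR T.FIELD3 = 120)
    ORDER BY V.FIELD01
"""

# Comment 1: IN instead of the OR on FIELD4.
# Comment 2: EXISTS instead of the join, since no column of T is selected.
EXISTS_SQL = """
    SELECT V.FIELD01
    FROM VIEW1 V
    WHERE V.FIELD4 IN ('103', '100')
      AND EXISTS (SELECT 1
                  FROM TABLE1 T
                  WHERE T.FIELD1 = 1
                    AND T.FIELD2 = V.FIELD14
                    AND (V.FIELD8 = 120 OR V.FIELD15 = 120
                         OR T.FIELD3 = 120))
    ORDER BY V.FIELD01
"""

join_rows = [row[0] for row in conn.execute(JOIN_SQL)]
exists_rows = [row[0] for row in conn.execute(EXISTS_SQL)]
print(join_rows, exists_rows)
```

One difference to keep in mind: if TABLE1 contained several rows matching the same V.FIELD14, the join form would return that view row once per match, while the EXISTS form returns it at most once. In the toy data each FIELD14 value matches at most one row, so the two results coincide.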


Hi Brian!

Regarding comment 1), it is surely correct; however, its contribution to
the performance of this particular query is minimal, if any.

I applied your suggestion in comment 2) too. It did not provide any
performance improvement (the query still takes 40 seconds to run); however,
there are two good points about it:

(a) It is a more concise way of expressing what I want to say. It is
much clearer to the person who reads the query and tries to understand
what it is supposed to do.

(b) It _does_ change the path shown in Visual Explain. No hash joins are
involved now... But this change is not reflected in improved
performance.

Thanks for the help!
Panagiotis


>1), surely it is a correct one, however its contribution to performance improvement of the particular query is minimal, if any.

It's just easier to read that way. When I first looked at

(V.FIELD4 = '103' OR V.FIELD4 = '100')
AND (V.FIELD8 = 120 OR V.FIELD15 = 120)

I figured the second clause was also FIELD8 twice. I only noticed it
was two different fields on second look. Had the FIELD4 been using an
IN, I probably would not have made that mistake. Nothing major here,
just the point that IN makes things clearer.

>But this change is not reflected into improved performance



Hmm... perhaps try using IN instead? I would think EXISTS should always
be better, but in many cases IN() works much much better.

B.
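
One way the IN variant suggested above might look is sketched here. The thread does not show the actual rewrite, so the query text is an assumption: SQLite (through Python's sqlite3 module) stands in for DB2 for iSeries, the rows are made up, and only the columns involved in the join are kept. Because the OR group refers to T.FIELD3, the sketch uses two IN subqueries: one for "a matching row exists in T" and one for "a matching row with FIELD3 = 120 exists in T". It only checks that this selects the same rows as the EXISTS form, not which one DB2 runs faster.

```python
import sqlite3

# Hypothetical toy data; SQLite stands in for DB2 for iSeries.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE VIEW1  (FIELD01 INTEGER, FIELD8 INTEGER,
                         FIELD14 INTEGER, FIELD15 INTEGER);
    CREATE TABLE TABLE1 (FIELD1 INTEGER, FIELD2 INTEGER, FIELD3 INTEGER);
    INSERT INTO VIEW1 VALUES
        (1, 120, 10,   0),
        (2,   0, 20,   0),
        (3,   0, 30,   0),
        (4,   0, 40, 120);
    INSERT INTO TABLE1 VALUES
        (1, 10, 0), (1, 20, 120), (1, 30, 0), (2, 40, 120);
""")

# EXISTS form: a correlated subquery carries the whole OR group.
EXISTS_SQL = """
    SELECT V.FIELD01
    FROM VIEW1 V
    WHERE EXISTS (SELECT 1
                  FROM TABLE1 T
                  WHERE T.FIELD1 = 1
                    AND T.FIELD2 = V.FIELD14
                    AND (V.FIELD8 = 120 OR V.FIELD15 = 120
                         OR T.FIELD3 = 120))
    ORDER BY V.FIELD01
"""

# IN form: uncorrelated subqueries. The first IN requires a matching T row;
# the second replaces the T.FIELD3 = 120 branch of the OR group.
IN_SQL = """
    SELECT V.FIELD01
    FROM VIEW1 V
    WHERE V.FIELD14 IN (SELECT T.FIELD2 FROM TABLE1 T WHERE T.FIELD1 = 1)
      AND (V.FIELD8 = 120 OR V.FIELD15 = 120
           OR V.FIELD14 IN (SELECT T.FIELD2 FROM TABLE1 T
                            WHERE T.FIELD1 = 1 AND T.FIELD3 = 120))
    ORDER BY V.FIELD01
"""

exists_ids = [row[0] for row in conn.execute(EXISTS_SQL)]
in_ids = [row[0] for row in conn.execute(IN_SQL)]
print(exists_ids, in_ids)
```

Whether the optimizer treats the two forms differently is something only Visual Explain on the real system can show.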

