Why does checking for null slow this query down?


Problem Description


I got this table containing 7,000 records

desc ARADMIN.V_PKGXMLCODE

Name                  Null     Type          
--------------------- -------- ------------- 
REQUEST_ID            NOT NULL VARCHAR2(15)  
AVAILABILITY                   VARCHAR2(69)  
XML_CODE                       CLOB          
PACKAGENAME_UNIQUE             VARCHAR2(50)  
CATALOG                        NUMBER(15)    
CHILD                          VARCHAR2(255) 
CLASSIFICATION_SYSTEM          NUMBER(15)    
E_MAIL                         VARCHAR2(69)

The query

SELECT COUNT(*) FROM ARADMIN.V_PKGXMLCODE WHERE (CATALOG <> 0 AND CATALOG <> 2) AND (NOT (CHILD IS NULL));

takes less than one second.

The query

SELECT COUNT(*) FROM ARADMIN.V_PKGXMLCODE WHERE (CATALOG IS NULL OR (CATALOG <> 0 AND CATALOG <> 2)) AND (NOT (CHILD IS NULL));

takes 23 seconds.

Explain plan, however, claims it should run really quickly...
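One way to see what the query actually does at run time, as opposed to what EXPLAIN PLAN predicts, is to capture the cursor's row-source statistics. This is a standard DBMS_XPLAN sketch rather than something from the original post:

-- Run the slow query once with row-source statistics enabled...
SELECT /*+ GATHER_PLAN_STATISTICS */ COUNT(*)
FROM ARADMIN.V_PKGXMLCODE
WHERE (CATALOG IS NULL OR (CATALOG <> 0 AND CATALOG <> 2)) AND (NOT (CHILD IS NULL));

-- ...then pull the actual plan, row counts and timings for the last statement run in this session:
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));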

What can I do?

Solution

The only way I can think of to get that kind of difference in execution speed would be to (a) have an index on field4 (which corresponds to CATALOG here), and (b) have a lot of empty data blocks; possibly from a high water mark set very high by repeated direct-path loads.

The first query would still use the index and perform as expected. But as null values are not indexed, the index cannot be used to check the or field4 is null condition, so it would fall back to a full table scan.
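As an aside, a minimal sketch of the indexing point, assuming an index like the one below exists (the question doesn't show any index DDL):

-- Hypothetical index assumed by this explanation:
CREATE INDEX ix_pkgxmlcode_catalog ON ARADMIN.V_PKGXMLCODE (CATALOG);

-- Oracle stores no B-tree entry for a row whose indexed columns are all NULL, so this
-- index can support the first query (whose predicates exclude NULLs) but has no entries
-- at all for the rows that "CATALOG IS NULL" needs to find, hence the full table scan.

-- The classic workaround, if indexing were the fix, is a never-null trailing constant,
-- which gives every row an index entry and makes IS NULL indexable:
CREATE INDEX ix_pkgxmlcode_catalog_nn ON ARADMIN.V_PKGXMLCODE (CATALOG, 0);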

That in itself shouldn't be a problem here, as a full table scan of 7000 rows shouldn't take long. But since it is taking so long, something else is going on. A full table scan has to examine every data block allocated to the table to see if they contain any rows, and the time it's taking suggests there are a lot more blocks than you need to hold 7000 rows, even with inline CLOB storage.
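A rough way to see that drift between allocated blocks and blocks that actually hold rows (a sketch, not part of the original answer) is to count the distinct blocks the 7,000 rows currently occupy, and compare that with the segment's total block count from the data dictionary (queried further down):

SELECT COUNT(DISTINCT DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) || '.' ||
                      DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)) AS blocks_with_rows
FROM ARADMIN.V_PKGXMLCODE;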

The simplest way to get a lot of empty data blocks is to have a lot of data and then delete most of it. But I believe you said in a now-deleted comment on an earlier question that performance used to be OK and has got worse. That can happen if you do direct-path inserts, particularly if you 'refresh' data by deleting it and then inserting new data in direct-path mode. You could be doing that with inserts that have the /*+ append */ hint; or in parallel; or through SQL*Loader. Each time you did that the high water mark would move, as old empty blocks wouldn't be reused; and each time performance of the query that checks for nulls would degrade a little. After a lot of iterations that would really start to add up.
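The kind of "refresh" pattern described above looks roughly like this (a sketch; the staging table name is hypothetical):

-- Delete the old data; the blocks it occupied stay allocated to the table.
DELETE FROM ARADMIN.V_PKGXMLCODE;
COMMIT;

-- Direct-path insert the new data; it is written into fresh blocks above the
-- high water mark instead of reusing the now-empty ones.
INSERT /*+ APPEND */ INTO ARADMIN.V_PKGXMLCODE
SELECT * FROM pkgxmlcode_staging;   -- hypothetical source of the refreshed data
COMMIT;                             -- a direct-path insert must be committed before the table is queried again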

You can check the data dictionary to see how much space is allocated to your table (user_segments etc.), and compare that to the size of the data you think you actually have; a query for that is sketched after the rebuild command below. You can reset the HWM by rebuilding the table, e.g. by doing:

alter table mytable move;

(preferably in a maintenance window!)
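A sketch of that data dictionary check (object names taken from the question; run it as the table owner, or switch to dba_segments/dba_tables with an OWNER filter):

-- Space actually allocated to the segment, including empty blocks below the HWM:
SELECT segment_name, blocks, bytes/1024/1024 AS mb
FROM user_segments
WHERE segment_name = 'V_PKGXMLCODE';

-- What the optimizer statistics say the data needs (gather stats first if they are stale):
SELECT table_name, num_rows, blocks, empty_blocks, avg_row_len
FROM user_tables
WHERE table_name = 'V_PKGXMLCODE';

Note that a plain ALTER TABLE ... MOVE leaves any indexes on the table UNUSABLE, so they need an ALTER INDEX ... REBUILD afterwards.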

As a demo I ran a cycle to direct-path insert and delete 7,000 rows over a hundred times, and then ran both your queries. The first one took 0.06 seconds (much of which is SQL Developer overhead); the second took 1.26 seconds. (I also ran Gordon's, which got a similar time, as it still has to do a full table scan.) With more iterations the difference would become even more marked, but I ran out of space... I then did an alter table move and re-ran your second query, which then took 0.05 seconds.
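A sketch of the kind of loop described above (with a hypothetical scratch table; not the answerer's actual script):

-- Scratch table purely to demonstrate the insert/delete pattern.
CREATE TABLE hwm_demo (id NUMBER, catalog NUMBER, child VARCHAR2(255));

BEGIN
  FOR i IN 1 .. 100 LOOP
    -- Direct-path insert: 7,000 rows go into fresh blocks above the current HWM.
    INSERT /*+ APPEND */ INTO hwm_demo (id, catalog, child)
    SELECT LEVEL, NULL, RPAD('x', 255, 'x')
    FROM dual
    CONNECT BY LEVEL <= 7000;
    COMMIT;   -- a direct-path insert must be committed before the table is touched again

    -- The delete empties the blocks but leaves the high water mark where it is.
    DELETE FROM hwm_demo;
    COMMIT;
  END LOOP;
END;
/

After a run like this, a full scan of hwm_demo has to visit all of those now-empty blocks; ALTER TABLE hwm_demo MOVE drops the high water mark back down.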
