Performance issue: difference between select s.* vs select *
Recently I had a performance problem with my query. It is described here: "poor Hibernate select performance comparing to running directly - how debug?" After a long struggle, I finally discovered that a query with a qualified select list, like:
SELECT sth.* FROM Something as sth...
is 300 times slower than a query that starts this way:
SELECT * FROM Something as sth..
Could somebody explain why that is? Pointers to external documentation would be really useful.
The table used for testing was:
The SALES_UNIT table contains some basic information about a sales unit node, such as its name. Its only association is to the table SALES_UNIT_TYPE, as ManyToOne. The primary key is ID plus the field VALID_FROM_DTTM, which is a date.
SALES_UNIT_RELATION contains the PARENT-CHILD relation between sales unit nodes. It consists of SALES_UNIT_PARENT_ID, SALES_UNIT_CHILD_ID and VALID_TO_DTTM/VALID_FROM_DTTM, and has no association with any other table. The PK here is ..PARENT_ID, ..CHILD_ID and VALID_FROM_DTTM.
The actual queries I've used were:
SELECT s.*
FROM sales_unit s LEFT JOIN sales_unit_relation r
on (s.sales_unit_id = r.sales_unit_child_id)
WHERE r.sales_unit_child_id IS NULL
SELECT *
FROM sales_unit s LEFT JOIN sales_unit_relation r
on (s.sales_unit_id = r.sales_unit_child_id)
WHERE r.sales_unit_child_id IS NULL
It is the same query: both versions use a left join, and the only difference is the select list.
They are two different queries, of course, and the plan CAN change when the select lists differ. For example, with s.* the optimizer may choose a full or fast full index scan on the left-joined table, because only the join column of r is needed, whereas with select * it will possibly be a full table scan.
To help you further, can we see the plans, please? Preferably do this in SQL*Plus:
set timing on
set autotrace traceonly
select s.* from sales_unit s left join sales_unit_relation r on (s.sales_unit_id = r.sales_unit_child_id) where r.sales_unit_child_id is null;
select * from sales_unit s left join sales_unit_relation r on (s.sales_unit_id = r.sales_unit_child_id) where r.sales_unit_child_id is null;
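If autotrace is not available in the session, a rough alternative for comparing the two plans is EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY. This is only a sketch using the table and column names from the question; it shows the optimizer's estimated plan, not the run-time statistics that autotrace reports.

```sql
-- Estimated plan for the qualified select list (only columns of s are returned)
EXPLAIN PLAN FOR
SELECT s.*
FROM sales_unit s LEFT JOIN sales_unit_relation r
  ON (s.sales_unit_id = r.sales_unit_child_id)
WHERE r.sales_unit_child_id IS NULL;

-- Print the plan that was just stored in PLAN_TABLE
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- Estimated plan for the unqualified select list (columns of s and r are returned)
EXPLAIN PLAN FOR
SELECT *
FROM sales_unit s LEFT JOIN sales_unit_relation r
  ON (s.sales_unit_id = r.sales_unit_child_id)
WHERE r.sales_unit_child_id IS NULL;

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Comparing the Operation and Rows columns of the two outputs side by side shows where the plans diverge.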
EDIT
Given your explain plan, do you see that there's CARDINALITY=1 on every step? You gathered stats when the tables were empty! See this:
SQL> select s.* from sales_unit s left join sales_unit_relation r on (s.sales_unit_id = r.child_sales_unit_id) where r.child_sales_unit_id is null;
no rows selected
Elapsed: 00:00:03.19
Execution Plan
----------------------------------------------------------
Plan hash value: 1064670292
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 48 | 27 (86)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 1 | 48 | 27 (86)| 00:00:01 |
| 2 | TABLE ACCESS FULL| SALES_UNIT | 1 | 35 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | SALES_REL_IX1 | 1 | 13 | 25 (92)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("S"."SALES_UNIT_ID"="R"."CHILD_SALES_UNIT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
200314 consistent gets
2220 physical reads
0 redo size
297 bytes sent via SQL*Net to client
339 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
So it used 200314 logical I/Os (consistent gets) and took a few seconds. Also note ROWS = 1 on every step (i.e. including the full scan). Let's gather stats:
SQL> begin dbms_stats.gather_table_stats(user, 'SALES_UNIT', degree=>8, cascade=>true); end;
2 /
PL/SQL procedure successfully completed.
SQL> begin dbms_stats.gather_table_stats(user, 'SALES_UNIT_RELATION', degree=>8, cascade=>true); end;
2 /
PL/SQL procedure successfully completed.
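One way to check whether the stored statistics match the data is to compare what the dictionary records with the actual row counts. The sketch below assumes the tables live in the current schema, as in the test above; a NUM_ROWS of 0 next to a large COUNT(*) would point to statistics gathered while the tables were empty.

```sql
-- What the optimizer believes: row counts recorded when stats were last gathered
SELECT table_name, num_rows, last_analyzed
FROM user_tables
WHERE table_name IN ('SALES_UNIT', 'SALES_UNIT_RELATION');

-- What is actually in the tables right now
SELECT COUNT(*) FROM sales_unit;
SELECT COUNT(*) FROM sales_unit_relation;
```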
And now rerun:
SQL> select s.* from sales_unit s left join sales_unit_relation r on (s.sales_unit_id = r.child_sales_unit_id) where r.child_sales_unit_id is null;
no rows selected
Elapsed: 00:00:00.84
Execution Plan
----------------------------------------------------------
Plan hash value: 2005864719
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 912 | 18240 | | 1659 (3)| 00:00:20 |
|* 1 | HASH JOIN ANTI | | 912 | 18240 | 2656K| 1659 (3)| 00:00:20 |
| 2 | TABLE ACCESS FULL | SALES_UNIT | 100K| 1472K| | 88 (3)| 00:00:02 |
| 3 | INDEX FAST FULL SCAN| SALES_REL_IX1 | 991K| 4841K| | 618 (3)| 00:00:08 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("S"."SALES_UNIT_ID"="R"."CHILD_SALES_UNIT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2537 consistent gets
0 physical reads
0 redo size
297 bytes sent via SQL*Net to client
339 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
SQL>
Now we used only 2537 gets, and the plan shows the right ROWS and a HASH join (better for our needs). My test tables are probably smaller than your real ones, which is why the timings are closer.
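Both plans above contain an ANTI join step, which means the optimizer treated the "left join ... where ... is null" pattern as "rows of s with no match in r". The same intent can be written directly with NOT EXISTS. This is only an illustrative sketch using the column names of my test tables (swap in sales_unit_child_id for the tables in the question); whether it produces the same plan on your system is not something shown here.

```sql
-- Sales units that never appear as a child in the relation table
SELECT s.*
FROM sales_unit s
WHERE NOT EXISTS (
  SELECT 1
  FROM sales_unit_relation r
  WHERE r.child_sales_unit_id = s.sales_unit_id
);
```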