如何基于父链接对Oracle层次表执行计数? [英] How to perform counts on a Oracle Hierarchical Table based on Parent Link?
问题描述
我具有以下Oracle 11g R2分层表定义:
I have the following Oracle 11g R2 hierarchical table definition:
Table Name: TECH_VALUES:
ID,
GROUP_ID,
LINK_ID
PARENT_GROUP_ID,
TECH_TYPE
根据上面的分层表定义,一些示例数据可能看起来像这样:
Based on the above hierarchical table definition, some sample data might look like this:
ID GROUP_ID LINK_ID PARENT_GROUP_ID TECH_TYPE
------- ------------- ------------ -------------------- --------------
1 100 LETTER_A 0
2 200 LETTER_B 0
3 300 LETTER_C 0
4 400 LETTER_A1 100 A
5 500 LETTER_A2 100 A
6 600 LETTER_A3 100 A
7 700 LETTER_AA1 400 B
8 800 LETTER_AAA1 700 C
9 900 LETTER_B2 200 B
10 1000 LETTER_BB5 900 B
12 1200 LETTER_CC1 300 C
13 1300 LETTER_CC2 300 C
14 1400 LETTER_CC3 300 A
15 1500 LETTER_CCC5 1400 A
16 1600 LETTER_CCC6 1500 C
17 1700 LETTER_BBB8 900 B
18 1800 LETTER_B 0
19 1900 LETTER_B2 1800 B
20 2000 LETTER_BB5 1900 B
21 2100 LETTER_BBB8 1900 B
请记住,只有三种技术类型,即A,B和C,但是可以跨越不同的LINK_IDs
,我如何仅基于父链接ID对这三种不同的TECH_TYPEs
进行计数父组ID为0且在其下方有子级?
Keeping in mind that there are only three Tech Types, i.e. A, B and C but can span over over different LINK_IDs
, how can I perform a count on these three different TECH_TYPEs
based only on the parent link ids where the parent group id is 0 and there children below them?
也可能有重复的父链接ID,例如LETTER_B和所有相同的子值,但具有不同的GROUP ID.
It is also possible to have duplicate parent link ids, such as LETTER_B and all of the same children values but for different GROUP IDs.
我基本上是在一个看起来像这样的表/报表查询之后:
I am basically after a table/report query that looks like this:
ID Link ID Tech Type A Tech Type B Tech Type C Group ID
--- -------------- ------------------- -------------------- ------------ --------------
1 LETTER_A 3 1 1 100
2 LETTER_B 0 3 0 200
3 LETTER_C 2 0 3 300
4 LETTER_B 0 3 0 1800
由于是分层结构,并且我的表可能包含30,000多个记录,因此我还需要确保生成上面显示的报告查询的性能很快.
Being hierarchical and that my table may consist of over 30,000 records, I need to also ensure that the performance to produce the above report query shown here is fast.
很显然,为了生成上述报告,我需要针对所有父链接id(其中PARENT_GROUP_ID = 0
)收集基于TECH_TYPE
的所有必要计数明细,并将其按照此报告布局存储在表格中.
Obviously in order to produce the above report, I need to gather all necessary count breakdowns based on TECH_TYPE
for all parent link id where the PARENT_GROUP_ID = 0
and store these in a table as per this report layout .
我该如何编写一个组合查询来执行计数并将信息存储到名为LINK_COUNTS的新表中,该表将基于该表?该表的列为:
How can I write a combined query that performs the counts as well as stores the information into a new table called LINK_COUNTS, which this report will be based off? Columns for this table will be:
ID,
LINK_ID,
TECH_TYPE_A,
TECH_TYPE_B,
TECH_TYPE_C,
GROUP_ID
在整个需求的最后,我希望能够基于上面的示例数据返回的结果来更新LINK_COUNTS表,这只是一个SQL UPDATE事务的一部分,因为顶级父链接ID已存在于其中我的LINK_COUNTS表中,只需为每个父节点链接提供计数细分值,即
In the end of this whole requirement, I want to be able to update the LINK_COUNTS table based on the results returned from the sample data above as part of just one SQL UPDATE transaction as the top level parent link ids will already exist within my LINK_COUNTS table, just need to provide count breakdown values for each parent node link, i.e.
LETTER_A
LETTER_B
LETTER_C
LETTER_B
使用类似的内容:
UPDATE link_counts
SET (TECH_TYPE_A,TECH_TYPE_B,TECH_TYPE_C) =
(with xyz
where link_id = LINK_COUNTS.link_id
....
etc
推荐答案
要选择数据,请尝试以下操作:
To select the data try this:
with xyz as (
select tech_type,
connect_by_root(link_id) as root_link_id,
connect_by_root(group_id) as root_group_id
from tech_values
start with parent_group_id = 0
connect by prior group_id = parent_group_id
)
select root_link_id, root_group_id,
a_tech_type as tech_type_a, b_tech_type as tech_type_b, c_tech_type as tech_type_c
from xyz pivot (
count(1) as tech_type
for tech_type in (
'A' as A,
'B' as B,
'C' as C
))
;
在39000条记录的测试集上,用Oracle 11.2(和刷新的缓冲区高速缓存)的Intel Core i5大约花费了0.15秒的时间来计算.我希望它能尽快满足您的需求.
On a test set of 39000 records it took an Intel Core i5 with Oracle 11.2 (and flushed buffer cache) approx 0.15 sec to compute this. I hope it is quick enough to satisfy your needs.
要使用这些数据更新目标表,请在上述SELECT
中遍历eithr并迭代地更新link_counts
表,或者直接使用MERGE
语句:
To update the target table with these data, eithr loop through the above SELECT
and update your link_counts
table iteratively, or simply use the MERGE
statement:
merge into link_counts T
using (
with xyz as (
select tech_type,
connect_by_root(link_id) as root_link_id,
connect_by_root(group_id) as root_group_id
from tech_values X
start with parent_group_id = 0
connect by prior group_id = parent_group_id
)
select *
from xyz pivot (
count(1) as tech_type
for tech_type in (
'A' as A,
'B' as B,
'C' as C
))
) S
on ( T.group_id = S.root_group_id )
when matched then
update
set T.tech_type_a = S.a_tech_type,
T.tech_type_b = S.b_tech_type,
T.tech_type_c = S.c_tech_type
when not matched then
insert (link_id, group_id, tech_type_a, tech_type_b, tech_type_c)
values (S.root_link_id, S.root_group_id, S.a_tech_type, S.b_tech_type, S.c_tech_type)
;
MERGE
假定GROUP_ID
是link_counts
表中的主键/唯一键.原始问题中tech_values
表的实际定义支持此假设.
The MERGE
assumes that the GROUP_ID
is a primary/unique key in the link_counts
table. The actual definition of the tech_values
table in the original question supports this assumption.
在SQL * Plus中运行的测试的后台处理如下:
A spool of a test run in SQL*Plus is as follows:
SQL>
SQL> select count(1)
2 from user_indexes
3 where table_name = 'TECH_VALUES'
4 ;
COUNT(1)
----------
0
SQL>
SQL> select count(1)
2 from user_constraints
3 where table_name = 'TECH_VALUES'
4 and constraint_type != 'C'
5 ;
COUNT(1)
----------
0
SQL>
SQL> alter system flush buffer_cache;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL>
SQL> select systimestamp from dual;
SYSTIMESTAMP
---------------------------------------------------------------------------
08.06.14 23:18:40,053000 +02:00
SQL>
SQL> select tech_type, count(1)
2 from tech_values
3 where parent_group_id != 0
4 group by rollup(tech_type);
T COUNT(1)
- ----------
A 20048
B 39984
C 19984
80016
SQL>
SQL> select systimestamp from dual;
SYSTIMESTAMP
---------------------------------------------------------------------------
08.06.14 23:18:40,144000 +02:00
SQL>
SQL> alter system flush buffer_cache;
System altered.
SQL> alter system flush shared_pool;
System altered.
SQL>
SQL> select systimestamp from dual;
SYSTIMESTAMP
---------------------------------------------------------------------------
08.06.14 23:18:40,246000 +02:00
SQL>
SQL> with xyz as (
2 select connect_by_root(link_id) as root_link_id, connect_by_root(group_id) as root_group_id, tech_type
3 from tech_values X
4 start with parent_group_id = 0
5 connect by prior group_id = parent_group_id
6 )
7 select *
8 from xyz pivot (
9 count(1) as tech_type
10 for tech_type in (
11 'A' as A,
12 'B' as B,
13 'C' as C
14 ));
ROOT_LINK_ID ROOT_GROUP_ID A_TECH_TYPE B_TECH_TYPE C_TECH_TYPE
---------------------------------------------------------------- ------------- ----------- ----------- -----------
LETTER_B 1800 3667 7482 3854
LETTER_B 200 3712 7583 3708
LETTER_C 300 6326 12450 6229
LETTER_A 100 6343 12469 6193
SQL>
SQL> select *
2 from table(dbms_xplan.display_cursor())
3 ;
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 59hmrtw6q3f4u, child number 0
-------------------------------------
with xyz as ( select connect_by_root(link_id) as root_link_id,
connect_by_root(group_id) as root_group_id, tech_type from
tech_values X start with parent_group_id = 0 connect by prior
group_id = parent_group_id ) select * from xyz pivot ( count(1)
as tech_type for tech_type in ( 'A' as A,
'B' as B, 'C' as C ))
Plan hash value: 3833790953
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 563 (100)| |
| 1 | HASH GROUP BY PIVOT | | 29102 | 1420K| 563 (1)| 01:23:35 |
| 2 | VIEW | | 29102 | 1420K| 1044 (47)| 02:35:00 |
|* 3 | CONNECT BY WITH FILTERING| | | | | |
|* 4 | TABLE ACCESS FULL | TECH_VALUES | 4 | 84 | 187 (0)| 00:27:46 |
|* 5 | HASH JOIN | | 29098 | 966K| 375 (1)| 00:55:41 |
| 6 | CONNECT BY PUMP | | | | | |
| 7 | TABLE ACCESS FULL | TECH_VALUES | 80020 | 1641K| 187 (0)| 00:27:46 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("PARENT_GROUP_ID"=PRIOR NULL)
4 - filter("PARENT_GROUP_ID"=0)
5 - access("connect$_by$_pump$_002"."prior group_id "="PARENT_GROUP_ID")
31 rows selected.
SQL>
SQL> select systimestamp from dual;
SYSTIMESTAMP
---------------------------------------------------------------------------
08.06.14 23:18:40,805000 +02:00
SQL>
SQL> spool off
这篇关于如何基于父链接对Oracle层次表执行计数?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!