如何基于时间戳(具有几秒钟的差异)联接两个表? [英] How to join two tables based on a timestamp (with variance of a few seconds)?
问题描述
我有两个表要结合并插入到基于三列组合的另一个表中.我会解释.
I have two tables that I'm trying to join and insert into another table based on a combination of three columns. I'll explain.
表M
| ANO | BNO | Timestamp | Duration
---------------------------------------------------------------
| 5612853 | 4732621 | 21.11.2013 09:50:58 | 196
| 4842988 | 5610953 | 21.11.2013 17:34:58 | 98
| 7765759 | 5612853 | 21.11.2013 20:48:00 | 377
| 2470321 | 2470263 | 21.11.2013 21:47:18 | 125
表N
| ANO | BNO | Timestamp | Duration
---------------------------------------------------------------
| 5612853 | 4732621 | 21.11.2013 09:50:52 | 196
| 4842988 | 5610953 | 21.11.2013 17:34:53 | 98
| 7765759 | 5612853 | 21.11.2013 20:47:55 | 377
| 2470321 | 2470263 | 21.11.2013 21:47:13 | 125
现在这两个表必须匹配,并根据
Now these two tables have to be matched and inserted into Table MN based on
M.ANO=N.ANO and M.BNO=N.BNO and ((M.TIMESTAMP = N.TIMESTAMP+5/86400) or (M.TIMESTAMP = N.TIMESTAMP+6/86400))
所以从理论上讲,我的输出表MN 应该是
So in theory, my output Table MN should be
| ANO | BNO | Timestamp | Duration || ANO | BNO | Timestamp | Duration
--------------------------------------------------------------------------------------------------------------
| 5612853 | 4732621 | 21.11.2013 09:50:58 | 196 || 5612853 | 4732621 | 21.11.2013 09:50:52 | 196
| 4842988 | 5610953 | 21.11.2013 17:34:58 | 98 || 4842988 | 5610953 | 21.11.2013 17:34:53 | 98
| 7765759 | 5612853 | 21.11.2013 20:48:00 | 377 || 7765759 | 5612853 | 21.11.2013 20:47:55 | 377
| 2470321 | 2470263 | 21.11.2013 21:47:18 | 125 || 2470321 | 2470263 | 21.11.2013 21:47:13 | 125
表M 具有约140万条记录,而表N 具有约90万条记录.
Table M has about 1.4 million records, and Table N has about 0.9 million.
我已尝试根据以下两个查询将两个表连接起来.但是执行需要花费几个小时,如果我必须每天运行一次,那是不可行的.
I've tried to join the two tables based on the below two queries. But it takes hours to execute and that isn't feasible if I have to run this on a daily basis.
INSERT INTO MN_RECON (
SELECT M.*,N.* FROM M FULL OUTER JOIN N ON
M.ANO=N.ANO AND M.BNO=N.BNO AND
((M.TIMESTAMP=N.TIMESTAMP+5/86400) OR (M.TIMESTAMP=N.TIMESTAMP+6/86400))
INSERT INTO MN_RECON (
SELECT M.*,N.* FROM M FULL OUTER JOIN N ON
M.ANO=N.ANO AND M.BNO=N.BNO AND
(M.TIMESTAMP-N.TIMESTAMP IN (5/86400,6/86400)
当我仅运行上述2个查询的SELECT语句时,我在一分钟内得到输出(仅几百个示例行),但是添加INSERT会花费很长时间. 有没有一种方法可以优化我想做的事?
When I run just the SELECT statement of the above 2 queries I get an output within a minute (just a few 100 sample lines) but with the INSERT added it takes a very long time. Is there a way to optimize what I want to do?
我需要在时间戳上进行匹配,因为白天可能多次出现相同的ANO-BNO组合,并且时间戳是它们之间的唯一标识符
I need it to match on timestamp because there can be multiple occurrences of the same ANO - BNO combination during the day, with the timestamp being the unique identifier between them
我需要一个完整的外部联接,因为我需要专注于不匹配的记录以及两个表之间持续时间不同的匹配记录.
And I need a full outer join because I need to focus on records that are not matched, as well as matched records with a difference in duration between the two tables.
其他Oracle信息 Oracle Database 11g企业版11.2.0.3.0 64位生产
Additional Oracle information Oracle Database 11g Enterprise Edition 11.2.0.3.0 64-bit Production
EXPLAIN PLAN
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | INSERT STATEMENT | | 2386K| 530M| | 2395M (1)|999:59:59 |
| 1 | LOAD TABLE CONVENTIONAL | MN_RECON | | | | | |
| 2 | VIEW | | 2386K| 530M| | 2395M (1)|999:59:59 |
| 3 | UNION-ALL | | | | | | |
|* 4 | HASH JOIN RIGHT OUTER| | 1417K| 109M| 49M| 10143 (1)| 00:02:02 |
| 5 | TABLE ACCESS FULL | N_VOICE | 968K| 38M| | 1753 (1)| 00:00:22 |
| 6 | TABLE ACCESS FULL | M_VOICE | 1417K| 52M| | 2479 (1)| 00:00:30 |
|* 7 | FILTER | | | | | | |
| 8 | TABLE ACCESS FULL | N_VOICE | 968K| 38M| | 1754 (1)| 00:00:22 |
|* 9 | TABLE ACCESS FULL | M_VOICE | 1 | 29 | | 2479 (1)| 00:00:30 |
推荐答案
加快查询速度的一种简单方法是创建基于函数的索引:
A simple way to speed up the query is by creating a function-based index:
CREATE INDEX indexname1 ON N (timestamp+5/86400);
CREATE INDEX indexname2 ON N (timestamp+6/86400);
这篇关于如何基于时间戳(具有几秒钟的差异)联接两个表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!