在Teradata中实现聚合 [英] Implement aggregation in Teradata
问题描述
我想汇总2个字段proct_dt,dw_job_id,并按升序排序
我的场景可以通过使用下面的查询和结果来清除。
sel * from scratch.COGIPF_RUNREPORT_test1 order by proct_dt,dw_job_id where dw_job_id = 10309
输出: p>
dw_job_id proct_dt start_ts end_ts time_diff
1 10,309 2018-03-06 00:00:00 2018-03-06 07:04:18 2018-03-06 07:04:22.457000 0
2 10,309 2018- 03-06 00:00:00 2018-03-06 06:58:50 2018-03-06 06:58:51.029000 0
3 10,309 2018-03-07 00:00:00 2018- 03-07 06:35:36 2018-03-07 06:36:03.809000 1
4 10,309 2018-03-06 00:00:00 2018-03-06 07:00:35 2018- 03-06 07:00:40.702000 0
<5> 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03- 06 06:30:42.759000 0
6 10,309 2018-03-06 00:00:00 2018-03-06 07:10:27 2018-03-06 07 : 10:28.715000 0
7 10,309 2018-03-06 00:00:00 2018-03-06 06:59:44 2018-03-06 06:59: 48.315000 0
8 10,309 2018-03-06 00:00:00 2018-03-06 07:00:15 2018-03-06 07:00:00 15.086000 0
9 10,309 2018-03-06 00:00:00 2018-03-06 07:04:02 2018-03-06 07:04:02.925000 0 p>
第二个查询: -
结果: -
dw_job_id proct_dt start_ts end_ts状态
<1> 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03 -06 06:30:42.759 12
2 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809 12
因此,在第二个查询中,我通过第一次出现proct_dt,start_ts,end_ts,从第一个表中获得了期望结果
请让我知道任何澄清
如果有人能够帮助实现这种情况,这将是非常好的。
谢谢,
目前还不清楚你的解释需要什么,但看起来你希望每天获得第一轮工作,这很容易使用Row_Number:
select *
from scratch.COGIPF_RUNREPORT_test1
where dw_job_id = 10309
qualify
row_number()
over(由dw_job_id分区,proc_dt - 对于每个作业& date
by start_ts)= 1 - 只有第一次运行
I want to aggregate 2 fields proct_dt, dw_job_id in ascendinng order My scenario would be clear by using below queries and result.
First query :-
sel * from scratch.COGIPF_RUNREPORT_test1 order by proct_dt,dw_job_id where dw_job_id =10309
Output :-
dw_job_id proct_dt start_ts end_ts time_diff
1 10,309 2018-03-06 00:00:00 2018-03-06 07:04:18 2018-03-06 07:04:22.457000 0
2 10,309 2018-03-06 00:00:00 2018-03-06 06:58:50 2018-03-06 06:58:51.029000 0
3 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809000 1
4 10,309 2018-03-06 00:00:00 2018-03-06 07:00:35 2018-03-06 07:00:40.702000 0
5 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03-06 06:30:42.759000 0
6 10,309 2018-03-06 00:00:00 2018-03-06 07:10:27 2018-03-06 07:10:28.715000 0
7 10,309 2018-03-06 00:00:00 2018-03-06 06:59:44 2018-03-06 06:59:48.315000 0
8 10,309 2018-03-06 00:00:00 2018-03-06 07:00:15 2018-03-06 07:00:15.086000 0
9 10,309 2018-03-06 00:00:00 2018-03-06 07:04:02 2018-03-06 07:04:02.925000 0
2nd Query :-
sel * from scratch.fact_test order by proct_dt asc ,dw_job_id asc where dw_job_id =10309
Result :-
dw_job_id proct_dt start_ts end_ts status
1 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03-06 06:30:42.759 12
2 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809 12
So here in the 2nd query I got the desire result as from 1st table by having first occurrence of proct_dt,start_ts,end_ts
Please let me know for any clarification It would be very great if anyone can help to achieve this case.
Thanks,
It's not clear what you want from your explanation, but it looks like you want to get the first run of a job per day, which is easy using a Row_Number:
select *
from scratch.COGIPF_RUNREPORT_test1
where dw_job_id =10309
qualify
row_number()
over (partition by dw_job_id, proc_dt -- for each job & date
order by start_ts) = 1 -- only the 1st run
这篇关于在Teradata中实现聚合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!