在Teradata中实现聚合 [英] Implement aggregation in Teradata

查看:175
本文介绍了在Teradata中实现聚合的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想汇总2个字段proct_dt,dw_job_id,并按升序排序
我的场景可以通过使用下面的查询和结果来清除。

第一个查询: -

sel * from scratch.COGIPF_RUNREPORT_test1 order by proct_dt,dw_job_id where dw_job_id = 10309

输出: p>

  dw_job_id proct_dt start_ts end_ts time_diff 

1 10,309 2018-03-06 00:00:00 2018-03-06 07:04:18 2018-03-06 07:04:22.457000 0

2 10,309 2018- 03-06 00:00:00 2018-03-06 06:58:50 2018-03-06 06:58:51.029000 0

3 10,309 2018-03-07 00:00:00 2018- 03-07 06:35:36 2018-03-07 06:36:03.809000 1

4 10,309 2018-03-06 00:00:00 2018-03-06 07:00:35 2018- 03-06 07:00:40.702000 0

<5> 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03- 06 06:30:42.759000 0



6 10,309 2018-03-06 00:00:00 2018-03-06 07:10:27 2018-03-06 07 : 10:28.715000 0

7 10,309 2018-03-06 00:00:00 2018-03-06 06:59:44 2018-03-06 06:59: 48.315000 0

8 10,309 2018-03-06 00:00:00 2018-03-06 07:00:15 2018-03-06 07:00:00 15.086000 0

9 10,309 2018-03-06 00:00:00 2018-03-06 07:04:02 2018-03-06 07:04:02.925000 0 p>

第二个查询: -

sel * from scratch.fact_test order by proct_dt asc,dw_job_id asc where dw_job_id = 10309



结果: -

  dw_job_id proct_dt start_ts end_ts状态



<1> 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03 -06 06:30:42.759 12

2 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809 12



因此,在第二个查询中,我通过第一次出现proct_dt,start_ts,end_ts,从第一个表中获得了期望结果



请让我知道任何澄清
如果有人能够帮助实现这种情况,这将是非常好的。



谢谢,

解决方案

目前还不清楚你的解释需要什么,但看起来你希望每天获得第一轮工作,这很容易使用Row_Number:

  select * 
from scratch.COGIPF_RUNREPORT_test1
where dw_job_id = 10309
qualify
row_number()
over(由dw_job_id分区,proc_dt - 对于每个作业& date
by start_ts)= 1 - 只有第一次运行


I want to aggregate 2 fields proct_dt, dw_job_id in ascendinng order My scenario would be clear by using below queries and result.

First query :-

sel * from scratch.COGIPF_RUNREPORT_test1 order by proct_dt,dw_job_id where dw_job_id =10309

Output :-

dw_job_id   proct_dt           start_ts          end_ts                      time_diff  

1 10,309 2018-03-06 00:00:00 2018-03-06 07:04:18 2018-03-06 07:04:22.457000 0
2 10,309 2018-03-06 00:00:00 2018-03-06 06:58:50 2018-03-06 06:58:51.029000 0
3 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809000 1
4 10,309 2018-03-06 00:00:00 2018-03-06 07:00:35 2018-03-06 07:00:40.702000 0

5 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03-06 06:30:42.759000 0

6 10,309 2018-03-06 00:00:00 2018-03-06 07:10:27 2018-03-06 07:10:28.715000 0

7 10,309 2018-03-06 00:00:00 2018-03-06 06:59:44 2018-03-06 06:59:48.315000 0

8 10,309 2018-03-06 00:00:00 2018-03-06 07:00:15 2018-03-06 07:00:15.086000 0

9 10,309 2018-03-06 00:00:00 2018-03-06 07:04:02 2018-03-06 07:04:02.925000 0

2nd Query :-

sel * from scratch.fact_test order by proct_dt asc ,dw_job_id asc where dw_job_id =10309

Result :-

dw_job_id   proct_dt            start_ts              end_ts      status

1 10,309 2018-03-06 00:00:00 2018-03-06 06:30:25 2018-03-06 06:30:42.759 12

2 10,309 2018-03-07 00:00:00 2018-03-07 06:35:36 2018-03-07 06:36:03.809 12

So here in the 2nd query I got the desire result as from 1st table by having first occurrence of proct_dt,start_ts,end_ts

Please let me know for any clarification It would be very great if anyone can help to achieve this case.

Thanks,

解决方案

It's not clear what you want from your explanation, but it looks like you want to get the first run of a job per day, which is easy using a Row_Number:

select * 
from scratch.COGIPF_RUNREPORT_test1
where dw_job_id =10309
qualify
   row_number()
   over (partition by dw_job_id, proc_dt -- for each job & date
         order by start_ts) = 1          -- only the 1st run

这篇关于在Teradata中实现聚合的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆