Hive SQL query to fill missing date values in table with nearest values
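Question
I have spent a few days trying to figure out how to fill in the missing dates with the nearest values in Hive, with no luck. Because of environment constraints, I need to do this in Hive SQL. The original table currently looks like this: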
account name,available balance,Date of balance
Peter,50000,2021-05-24
Peter,50035,2021-05-25
Peter,50035,2021-05-26
Peter,50610,2021-05-28
Peter,51710,2021-06-01
Peter,53028.1,2021-06-02
Peter,53916.1,2021-06-03
Mary,50000,2021-05-24
Mary,50035,2021-05-25
Mary,53028.1,2021-05-30
What I need is to convert the table above into the following table:
account name,available balance,Date of balance
Peter,50000,2021-05-24
Peter,50035,2021-05-25
Peter,50035,2021-05-26
Peter,50035,2021-05-27
Peter,50610,2021-05-28
Peter,50610,2021-05-29
Peter,50610,2021-05-30
Mary,50000,2021-05-24
Mary,50035,2021-05-25
Mary,50035,2021-05-26
Mary,50035,2021-05-27
Mary,50035,2021-05-28
Mary,50035,2021-05-29
Mary,53028.1,2021-05-30
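Could anyone share the Hive SQL logic to make this change?
Answer
Use the lead() function to get the next date for each account, calculate the difference in days, build a string of spaces from that difference (difference minus one, as in the query below), split it, use posexplode to generate one row per day, and add the position to the date to get the missing dates: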
with mytable as ( -- demo dataset; use your table instead of this CTE
    select stack(10, -- number of tuples
        'Peter', float(50000),   '2021-05-24',
        'Peter', float(50035),   '2021-05-25',
        'Peter', float(50035),   '2021-05-26',
        'Peter', float(50610),   '2021-05-28',
        'Peter', float(51710),   '2021-06-01',
        'Peter', float(53028.1), '2021-06-02',
        'Peter', float(53916.1), '2021-06-03',
        'Mary',  float(50000),   '2021-05-24',
        'Mary',  float(50035),   '2021-05-25',
        'Mary',  float(53028.1), '2021-05-30'
    ) as (account_name, available_balance, Date_of_balance)
)
select account_name, available_balance, date_add(Date_of_balance, e.i) as Date_of_balance
from
( -- get next_date to generate the date range
    select account_name, available_balance, Date_of_balance,
           lead(Date_of_balance, 1, Date_of_balance) over (partition by account_name order by Date_of_balance) as next_date
    from mytable d -- use your table
) s
lateral view outer posexplode(split(space(datediff(next_date, Date_of_balance) - 1), '')) e as i, x -- generate rows
order by account_name desc, Date_of_balance -- gives the same row order as the converted table in the question
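To see what the inner subquery produces before any rows are generated, the lead() and datediff() steps can be run on their own. This is a minimal sketch that reuses three of Mary's rows from the question; the column alias days_to_next is made up for illustration and is not part of the answer's query.

```sql
with mytable as ( -- three of Mary's rows from the question, balances omitted
    select stack(3,
        'Mary', '2021-05-24',
        'Mary', '2021-05-25',
        'Mary', '2021-05-30'
    ) as (account_name, Date_of_balance)
)
select account_name, Date_of_balance, next_date,
       datediff(next_date, Date_of_balance) as days_to_next -- hypothetical alias
from
(
    select account_name, Date_of_balance,
           -- the third argument makes the last row of each account point at itself
           lead(Date_of_balance, 1, Date_of_balance) over (partition by account_name order by Date_of_balance) as next_date
    from mytable
) s
order by Date_of_balance;
```

The 2021-05-25 row gets next_date 2021-05-30 and a gap of 5 days, which is why its balance is repeated for five consecutive dates in the final output. The last row has a gap of 0, because lead() falls back to the row's own date.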
Result:
account_name available_balance date_of_balance
Peter 50000 2021-05-24
Peter 50035 2021-05-25
Peter 50035 2021-05-26
Peter 50035 2021-05-27
Peter 50610 2021-05-28
Peter 50610 2021-05-29
Peter 50610 2021-05-30
Peter 50610 2021-05-31
Peter 51710 2021-06-01
Peter 53028.1 2021-06-02
Peter 53916.1 2021-06-03
Mary 50000 2021-05-24
Mary 50035 2021-05-25
Mary 50035 2021-05-26
Mary 50035 2021-05-27
Mary 50035 2021-05-28
Mary 50035 2021-05-29
Mary 53028.1 2021-05-30
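The row-generation step can also be tried in isolation. This is a sketch under assumptions: the one-row subquery and the column n are hypothetical stand-ins for datediff(next_date, Date_of_balance), and it relies on the same split() behavior as the query above (splitting on an empty pattern keeps the trailing empty element).

```sql
-- n stands in for the gap in days between a row and the next row of the same account
select t.n, e.i
from (select 3 as n) t -- hypothetical input; try 1 or 0 as well
lateral view outer posexplode(split(space(t.n - 1), '')) e as i, x;
-- each position i is what the main query passes to date_add(Date_of_balance, i)
```

For n = 3 this should return the positions 0, 1 and 2, so the original date plus the two missing days. For n = 1 or n = 0 the string of spaces is empty and a single position 0 comes back, which matches the result above: rows with no gap, and the last row of each account, appear exactly once. The outer keyword on the lateral view additionally keeps the source row if the exploded array were ever empty.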