In Hive, replacing NULL values with values from the same column using COALESCE
Problem description
I would like to replace the NULL values in a particular column with values from the same column.
I tried the following:
select
d_day,
COALESCE(val, LAST_VALUE(val, TRUE)
OVER( ORDER BY d_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))
as val from data_table
Recommended answer
One way to do it is with two window functions. Here is an example:
with tmp_table as (
select 1 as ts, 3 as val
union all
select 2 as ts, NULL as val
union all
select 3 as ts, NULL as val
union all
select 4 as ts, 4 as val
union all
select 5 as ts, NULL as val
union all
select 6 as ts, 5 as val
union all
select 7 as ts, 6 as val
)
, rank_table as (
select *, SUM(val) OVER (ORDER BY ts ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
from tmp_table
)
select *, max(val) over (partition by rnk)
from rank_table
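To see why this works, it can help to look at the intermediate rnk column. The sketch below reruns the same query on the sample rows from tmp_table and traces the values by hand in the comments; the alias filled_val and the final ORDER BY are added here only for readability. Two things stand out: each NULL is filled with the next non-null value, and the grouping relies on every non-null value changing the running sum, which holds for the positive sample values used here.

```sql
with tmp_table as (
  select 1 as ts, 3 as val
  union all select 2 as ts, cast(NULL as int) as val
  union all select 3 as ts, cast(NULL as int) as val
  union all select 4 as ts, 4 as val
  union all select 5 as ts, cast(NULL as int) as val
  union all select 6 as ts, 5 as val
  union all select 7 as ts, 6 as val
)
, rank_table as (
  -- SUM skips NULLs, so a NULL row gets the same total as the next non-null row
  select *, SUM(val) OVER (ORDER BY ts ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from tmp_table
)
-- ts | val  | rnk | filled_val
--  1 | 3    | 18  | 3
--  2 | NULL | 15  | 4
--  3 | NULL | 15  | 4
--  4 | 4    | 15  | 4
--  5 | NULL | 11  | 5
--  6 | 5    | 11  | 5
--  7 | 6    |  6  | 6
select ts, val, rnk, max(val) over (partition by rnk) as filled_val
from rank_table
order by ts
```

Each rnk group contains exactly one non-null value, so max(val) simply picks that value for every row in the group.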
So in your case:
with rank_table as (
select *, SUM(val) OVER (ORDER BY d_day ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
from your_table
)
select *, max(val) over (partition by rnk)
from rank_table
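If val can contain zeros or negative numbers, the running sum may not change at a non-null row, and unrelated rows could end up in the same group. A possible variation, which is not part of the original answer, counts non-null values instead of summing them. The sketch below also counts from the start of the table rather than from the end, so each NULL takes the previous non-null value, which is the direction the LAST_VALUE attempt in the question points at. The names grp_table, grp and filled_val are illustrative; your_table, d_day and val are as above.

```sql
with grp_table as (
  select *,
         -- COUNT(val) ignores NULLs, so the counter only increases on non-null rows
         COUNT(val) OVER (ORDER BY d_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as grp
  from your_table
)
select d_day,
       -- each group is one non-null value followed by the NULL rows after it;
       -- leading NULLs fall into group 0 and stay NULL
       max(val) over (partition by grp) as filled_val
from grp_table
```

To keep the original fill direction (next non-null value), the same idea should work with the frame ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING.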
Keep in mind that the first window function, which uses ORDER BY d_day without a PARTITION BY clause, will make your job run on a single reducer, so if your data is really large it might take some time to finish.
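If the rows fall into independent series, one way to spread the work is to add a PARTITION BY to both window functions, so that Hive can distribute the partitions across reducers. The column series_id below is hypothetical; the question does not mention such a column, so this sketch only applies if your table has a suitable key and NULLs should be filled within each series separately.

```sql
with rank_table as (
  select *,
         -- series_id is an assumed grouping column, not part of the original question
         SUM(val) OVER (PARTITION BY series_id ORDER BY d_day
                        ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from your_table
)
-- rnk values can repeat across series, so the second window partitions by both columns
select *, max(val) over (partition by series_id, rnk) as filled_val
from rank_table
```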