In Hive, replacing NULL values with values from the same column using COALESCE

Problem description

I would like to replace the null values of a particular column with values from the same column.

I tried the following:

select  
    d_day,
    COALESCE(val, LAST_VALUE(val, TRUE) 
    OVER( ORDER BY d_day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) 
    as val from data_table

Recommended answer

One way to do it is with two window functions. Here is an example:

with tmp_table as (
  select 1 as ts, 3 as val 
  union all
  select 2 as ts, NULL as val
  union all 
  select 3 as ts, NULL as val
  union all
  select 4 as ts, 4 as val
  union all
  select 5 as ts, NULL as val
  union all
  select 6 as ts, 5 as val
  union all 
  select 7 as ts, 6 as val
)
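, rank_table as (
  -- rnk is the sum of val from the current row to the last row, so it stays
  -- constant across a run of NULLs and changes only after a non-null, nonzero val
  select *, SUM(val) OVER (ORDER BY ts ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from tmp_table
)
-- each rnk group holds a run of NULLs plus the next non-null val;
-- max() copies that val to every row of the group
select *, max(val) over (partition by rnk)
  from rank_table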
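
The SUM-based grouping key relies on the non-null values being positive: a zero does not change the running sum, and negative values can make two different groups share the same sum. A minimal sketch of a variant that avoids this by counting non-null values instead of summing them is shown below. It assumes the goal is to fill each NULL with the next non-null value in ts order; the names `grp` and `filled_val` and the explicit `cast(NULL as int)` are choices made here for illustration, not part of the original answer.

```sql
with tmp_table as (
  select 1 as ts, 3 as val
  union all
  select 2 as ts, cast(NULL as int) as val
  union all
  select 3 as ts, cast(NULL as int) as val
  union all
  select 4 as ts, 4 as val
  union all
  select 5 as ts, cast(NULL as int) as val
  union all
  select 6 as ts, 5 as val
  union all
  select 7 as ts, 6 as val
)
, group_table as (
  -- COUNT(val) ignores NULLs, so grp is the number of non-null values from
  -- the current row to the last row; it drops by one after every non-null row
  select *, COUNT(val) OVER (ORDER BY ts ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as grp
  from tmp_table
)
-- each grp group contains exactly one non-null val (or none for trailing NULLs)
select ts, val, max(val) over (partition by grp) as filled_val
  from group_table
```

For the sample rows, filled_val should come out as 3, 4, 4, 4, 5, 5, 6 for ts 1 through 7. Trailing NULLs with no later non-null value would stay NULL.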

So in your case:

with rank_table as ( 
select *, SUM(val) OVER (ORDER BY d_day ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from your_table
)
select *, max(val) over (partition by rnk)
  from rank_table

Keep in mind that the first window uses ORDER BY d_day without a PARTITION BY, so that stage of the job will run on a single reducer. If your data is really large, it may take some time to finish.
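
If the rows can be filled independently per group, adding a PARTITION BY to both windows lets Hive distribute the work across reducers by that key. The sketch below assumes a column named `group_key` exists in your table; that column is hypothetical and does not appear in the question, so this only applies if your data has such a key.

```sql
with rank_table as (
  -- group_key is a hypothetical column; the suffix sum restarts for each key,
  -- so rows are distributed by group_key instead of going to one reducer
  select *, SUM(val) OVER (PARTITION BY group_key ORDER BY d_day
                           ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) as rnk
  from your_table
)
-- rnk values are only comparable within one group_key, so partition by both
select *, max(val) over (partition by group_key, rnk)
  from rank_table
```

With this form a NULL is filled only from later rows that share its group_key, which may or may not be what you want.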
