蜂巢-将值范围之间的一行拆分为多行 [英] hive - split a row into multiple rows between the range of values

查看:74
本文介绍了蜂巢-将值范围之间的一行拆分为多行的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在下面有一张桌子,想按从开始到结束的范围对行进行拆分.

I have a table below and would like to split the rows by the range from start to end columns.

,即id和value应该在开始&两次之间重复每个值结束(包括两端)

i.e id and value should repeat for each value between start & end(both inclusive)

--------------------------------------
id      | value   | start     | end
--------------------------------------
1       | 5       | 1         | 4
2       | 8       | 5         | 9
--------------------------------------

所需的输出

--------------------------------------
id      | value   | current
--------------------------------------
1       | 5       | 1
1       | 5       | 2
1       | 5       | 3
1       | 5       | 4

2       | 8       | 5
2       | 8       | 6
2       | 8       | 7
2       | 8       | 8
2       | 8       | 9
--------------------------------------

我可以在Java/python中编写自己的UDF以得到此结果,但想检查是否可以使用任何现有的Hive UDF在Hive SQL中实现

I can write my own UDF in java/python to get this result but would like to check if I can implement in Hive SQL using any existing hive UDFs

谢谢.

推荐答案

您可以使用posexplode()UDF.

You can do using posexplode() UDF.

WITH
data AS (
  SELECT 1 AS id, 5 AS value, 1 AS start, 4 AS `end`
  UNION ALL
  SELECT 2 AS id, 8 AS value, 5 AS start, 9 AS `end`
)
SELECT distinct id, value, (zr.start+rge.diff) as `current`
   FROM data zr LATERAL VIEW posexplode(split(space(zr.`end`-zr.start),' ')) rge as diff, x

这是它的输出:

+-----+--------+----------+--+
| id  | value  | current  |
+-----+--------+----------+--+
| 1   | 5      | 1        |
| 1   | 5      | 2        |
| 1   | 5      | 3        |
| 1   | 5      | 4        |
| 2   | 8      | 5        |
| 2   | 8      | 6        |
| 2   | 8      | 7        |
| 2   | 8      | 8        |
| 2   | 8      | 9        |
+-----+--------+----------+--+

这篇关于蜂巢-将值范围之间的一行拆分为多行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆