Transpose dataset in Hive


Problem description

I'm trying to transpose a variable in Hive, turning the first table below into the second:

Id1  Id2  Event
1    1    7
2    2    3
2    2    7

Id1  Id2  Event_7  Event_3
1    1    1
2    2    1        1

Here is what I have so far:

 create temporary table event_trans as 
           select Id1, Id2,Event
           kv['3'] as Event_3,
           kv['7'] as Event_7
           from(
             select Id1, Id2, collect(Event, '1') as kv
             from event1
             group by Id1, Id2

             )t




Error: Error while compiling statement: FAILED: ParseException line 1:84 missing EOF at '[' near 'kv'

I'd also like to know how to transpose a dataset in Hive that contains duplicate rows, such as the one below, so that it produces the same output:

Id1  Id2  Event
1    1    7
2    2    3
2    2    7
2    2    7

Id1  Id2  Event_7  Event_3
1    1    1
2    2    1        1

Any help is appreciated!

Recommended answer

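In Hive SQL, you can do conditional aggregation. Because max() returns a single value per group no matter how many times a row repeats, the same query also covers the dataset with duplicates: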

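select
    id1,
    id2,
    -- 1 if the group contains the event, NULL otherwise
    max(case when event = 7 then 1 end) as event_7,
    max(case when event = 3 then 1 end) as event_3
from event1  -- source table from the question
group by id1, id2
order by id1, id2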
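
The conditional aggregation can be sanity-checked without a Hive cluster. The following is only a minimal sketch: it assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Hive, which is reasonable here because the CASE / MAX / GROUP BY pattern is plain SQL, and it reads from the event1 table named in the question. The sample rows are the duplicate dataset from the question.

```python
import sqlite3

# Assumption: SQLite stands in for Hive. Only the portable
# CASE / MAX / GROUP BY pattern is exercised, nothing Hive-specific.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event1 (id1 INTEGER, id2 INTEGER, event INTEGER)")
conn.executemany(
    "INSERT INTO event1 VALUES (?, ?, ?)",
    [(1, 1, 7), (2, 2, 3), (2, 2, 7), (2, 2, 7)],  # last row is the duplicate
)

rows = conn.execute(
    """
    SELECT id1,
           id2,
           MAX(CASE WHEN event = 7 THEN 1 END) AS event_7,
           MAX(CASE WHEN event = 3 THEN 1 END) AS event_3
    FROM event1
    GROUP BY id1, id2
    ORDER BY id1, id2
    """
).fetchall()

for row in rows:
    print(row)
# (1, 1, 1, None)
# (2, 2, 1, 1)
```

The None in the first row corresponds to the empty Event_3 cell in the desired output: when no row in the group matches, every CASE expression yields NULL and MAX over only NULLs is NULL.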
