Group varying number of rows as columns in Hive table

Question

I have a Hive table that contains user IDs and a selection made by each user, one row per selection. It basically looks like this:

userID    selection
   1          A
   1          D
   1          F
   2          A
   2          C

What I would like to do is condense this information and end up with something like:

 userID    selection1    selection2    selection3
    1          A             D              F
    2          A             C

Is this even possible? It isn't clear to me how to do this grouping, given that the number of possible selections varies with the user.

It would even be fine if I could do something like:

 userID    selection 
    1        A,D,F    
    2         A,C     

I have tried several approaches but so far nothing has been close enough to describe. I think what I want is something of the form:

select userID, group_concat(selection) from table_name group by userID

but as far as I can tell the group_concat function isn't available.

Thanks!

Recommended answer

In case anyone ends up needing the answer to this, it can be achieved via:

select userID, collect_set(selection) from table_name group by userID
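
Note that collect_set returns an array with duplicates removed, and the order of its elements is not guaranteed. As a rough sketch of how that array could be turned into the two shapes asked for in the question, something like the following may work. It reuses the table_name, userID and selection names from the question; the sort_array call, the grouped alias and the choice of exactly three output columns are assumptions made for illustration, not part of the original answer.

```sql
-- Comma-separated string per user, e.g. "A,D,F".
-- sort_array is added here only to make the order deterministic.
SELECT userID,
       concat_ws(',', sort_array(collect_set(selection))) AS selection
FROM table_name
GROUP BY userID;

-- One column per selection. The number of columns must be fixed in
-- advance (three here, an assumption); users with fewer selections
-- should get NULL in the unused columns.
SELECT userID,
       selections[0] AS selection1,
       selections[1] AS selection2,
       selections[2] AS selection3
FROM (
  SELECT userID,
         sort_array(collect_set(selection)) AS selections
  FROM table_name
  GROUP BY userID
) grouped;
```

If duplicates need to be kept, collect_list (available in more recent Hive versions) can be used in place of collect_set.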
