Is there a way to transpose data in Hive?


Question

Can data in Hive be transposed, so that the rows become columns and the columns become rows? If there is no built-in function for this, is there a way to do it in a couple of steps?

I have a table like this:

 | ID   |   Names   |  Proc1   |   Proc2 |  Proc3  |
 | 1    |    A1     |   x      |   b     |  f      |
 | 2    |    B1     |   y      |   c     |  g      |
 | 3    |    C1     |   z      |   d     |  h      |
 | 4    |    D1     |   a      |   e     |  i      |

I would like it to look like this:

 | A1   |   B1   |  C1   |   D1 |  
 | x    |    y   |   z   |   a  |
 | b    |    c   |   d   |   e  |
 | f    |    g   |   h   |   i  |

I have been looking at other related questions, and they all mention using lateral views and explode. Is there a way to choose selectively which columns go through the lateral view and explode?

Also, what might be the rough process to achieve what I would like to do? Please help me out. Thanks!

Edit: I have been reading this link: https://cwiki.apache.org/Hive/languagemanual-lateralview.html and it shows me half of what I want to achieve. The first example in the link is basically what I'd like, except that I don't want the rows to repeat and I want them as column names. Any ideas on how to get the data into a form such that an explode would produce my desired output? Or the other way around, that is, explode first, followed by another step that then leads to my desired output table? Thanks again!

Recommended answer

I don't know of a way to do this out of the box in Hive, sorry. You get close with explode etc., but I don't think it can get the job done on its own.

Overall, conceptually, I think it's hard to do a transpose without knowing in advance what the columns of the destination table are going to be. This is true for Hive in particular, because the metadata about a table (how many columns it has, their types, their names, etc.) is kept in a database, the metastore. And it's true in general, because not knowing the columns beforehand would require some sort of in-memory holding of data (OK, sure, with spills), and users would need to be careful about not overflowing memory and such (just like with dynamic partitioning in Hive).

In any case, long story short, if you know the columns of the destination table beforehand, life is good. There isn't a set command in Hive per se, to the best of my knowledge, but you could use a bunch of if clauses and case statements (ugly, I know, but that's how I have done the same in the past) in the select clause to transpose the data. Something along the lines of the question "SQL - How to transpose?".
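A minimal sketch of the case-statement approach, using the sample table from the question. Several things here are assumptions made for illustration and are not from the original answer: the table name `procs`, the lowercase column names, the `long_form` alias, and the use of Python's built-in `sqlite3` module as a stand-in for Hive so that the query can be run locally. The query sticks to plain SQL (`CASE WHEN`, `MAX`, `UNION ALL`, `GROUP BY`), which Hive also supports, but it has not been verified against a Hive installation here.

```python
import sqlite3

# SQLite stands in for Hive (assumption) so the sketch is runnable locally.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE procs (id INTEGER, names TEXT, proc1 TEXT, proc2 TEXT, proc3 TEXT)"
)
conn.executemany(
    "INSERT INTO procs VALUES (?, ?, ?, ?, ?)",
    [
        (1, "A1", "x", "b", "f"),
        (2, "B1", "y", "c", "g"),
        (3, "C1", "z", "d", "h"),
        (4, "D1", "a", "e", "i"),
    ],
)

# Step 1 (inner query): turn each ProcN column into its own row,
#   giving a long table of (proc, names, val).
# Step 2 (outer query): one CASE expression per destination column.
#   The names A1..D1 must be known when the query is written.
#   MAX() skips the NULLs left by non-matching rows, so each group
#   collapses to the single matching value.
TRANSPOSE_SQL = """
SELECT proc,
       MAX(CASE WHEN names = 'A1' THEN val END) AS a1,
       MAX(CASE WHEN names = 'B1' THEN val END) AS b1,
       MAX(CASE WHEN names = 'C1' THEN val END) AS c1,
       MAX(CASE WHEN names = 'D1' THEN val END) AS d1
FROM (
    SELECT 'Proc1' AS proc, names, proc1 AS val FROM procs
    UNION ALL
    SELECT 'Proc2' AS proc, names, proc2 AS val FROM procs
    UNION ALL
    SELECT 'Proc3' AS proc, names, proc3 AS val FROM procs
) long_form
GROUP BY proc
ORDER BY proc
"""

rows = conn.execute(TRANSPOSE_SQL).fetchall()
for row in rows:
    print(row)
```

Each output row corresponds to one of the original `Proc` columns, with one value per name. In Hive, the `UNION ALL` subquery could probably be replaced by a `LATERAL VIEW explode(map(...))` over only the columns you want to unpivot, which is one way to choose the exploded columns selectively. Treat that as a suggestion to try rather than a tested recipe.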

Let me know how it goes!
