Pivoting in Pig
Question
This is related to the question "Pivot table with Apache Pig". My input data is:
Id Name Value
1 Column1 Row11
1 Column2 Row12
1 Column3 Row13
2 Column1 Row21
2 Column2 Row22
2 Column3 Row23
I want to pivot it and get the following output:
Id Column1 Column2 Column3
1 Row11 Row12 Row13
2 Row21 Row22 Row23
Please let me know how to do this in Pig.
Recommended answer
The simplest way to do it without a UDF is to group on Id, then in a nested foreach filter the rows for each column name, and finally combine them in the generate statement. Note that flattening an empty bag produces no rows, so an Id that lacks one of the columns is dropped from the output. See the script:
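-- Load the (Id, Name, Value) rows; the default loader expects tab-separated fields
inpt = load '~/rows_to_cols.txt' as (Id : chararray, Name : chararray, Value : chararray);
grp = group inpt by Id;
maps = foreach grp {
    -- Within each Id, pick out the rows for each column name
    col1 = filter inpt by Name == 'Column1';
    col2 = filter inpt by Name == 'Column2';
    col3 = filter inpt by Name == 'Column3';
    -- Flatten the per-column bags side by side to form one output row per Id
    generate flatten(group) as Id, flatten(col1.Value) as Column1, flatten(col2.Value) as Column2, flatten(col3.Value) as Column3;
};
dump maps;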
Output:
(1,Row11,Row12,Row13)
(2,Row21,Row22,Row23)
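To make the behaviour of the script easier to reason about, here is a rough plain-Python model of the group, filter, and flatten steps. It is not Pig code, and the function name pivot_like_pig is made up for illustration. It assumes that flattening several bags in one generate yields their cross product, so an empty bag yields no row for that Id. Rows are sorted by Id here only to make the result deterministic; Pig does not guarantee an output order.

```python
from collections import defaultdict
from itertools import product


def pivot_like_pig(rows, columns):
    """Model of: group by Id, filter per column name, flatten the results."""
    # Id -> list of (Name, Value), similar to `group inpt by Id`
    groups = defaultdict(list)
    for id_, name, value in rows:
        groups[id_].append((name, value))

    out = []
    for id_ in sorted(groups):
        # One "bag" of values per requested column, similar to
        # `filter inpt by Name == '...'` followed by `.Value`
        bags = [[v for n, v in groups[id_] if n == col] for col in columns]
        # Flattening several bags gives their cross product;
        # if any bag is empty, no row is produced for this Id
        for combo in product(*bags):
            out.append((id_, *combo))
    return out


rows = [
    ("1", "Column1", "Row11"),
    ("1", "Column2", "Row12"),
    ("1", "Column3", "Row13"),
    ("2", "Column1", "Row21"),
    ("2", "Column2", "Row22"),
    ("2", "Column3", "Row23"),
]
columns = ["Column1", "Column2", "Column3"]
result = pivot_like_pig(rows, columns)
print(result)
```

Two consequences of this model are worth keeping in mind: an Id that has no row for one of the columns disappears from the result, and an Id with two values for the same column produces two output rows.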
Another option would be to write a UDF that converts a bag{name, value} into a map[], then get the values by using the column names as keys (e.g. vals#'Column1').
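The core of such a UDF is small. The following is a minimal plain-Python sketch of the conversion logic only, assuming the bag arrives as a list of (name, value) pairs. The name bag_to_map is hypothetical, and the wrapping needed to register it as an actual Pig UDF (output schema, register statement) is not shown.

```python
def bag_to_map(bag):
    """Turn a bag of (name, value) tuples into a dict keyed by name."""
    result = {}
    for name, value in bag:
        # Assumption: if a name repeats within one Id, the last value wins
        result[name] = value
    return result


# The bag that grouping on Id would produce for Id 1 (Id field omitted)
bag_for_id_1 = [("Column1", "Row11"), ("Column2", "Row12"), ("Column3", "Row13")]
vals = bag_to_map(bag_for_id_1)

# Equivalent in spirit to vals#'Column1', vals#'Column2', vals#'Column3'
row = (vals.get("Column1"), vals.get("Column2"), vals.get("Column3"))
print(row)
```

One design difference from the filter-and-flatten script: looking up a missing key gives an empty value rather than removing the row, so in this sketch an Id with a missing column is kept, with a gap in that position.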