Expand an array with Apache Pig
Problem description
I'm analyzing data with Apache Pig and could not find a way to expand an array of items. Here is the schema I'm working with, and an example of the desired output:
(col1:int, col2:int, items:{ARRAY_ELEM:(name:chararray, total:int)})
input = (1, 1, {("bird", 5), ("bear", 12), ("wolf", 10)})
output = (1, 1, "bird", 5, "bear", 12, "wolf", 10)
Is there any way to do this transformation?
Thanks for your help!
Recommended answer
If you need to do this transformation right now, the easiest way is probably to write a UDF in Python or Java (I am not aware of any built-in solution).
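As a rough illustration of what the body of such a UDF might look like, here is a minimal Python sketch. The function name `expand_items` is made up for this example, and it assumes the bag arrives as a list of (name, total) tuples. How the function is registered with Pig and how its output schema is declared are not shown.

```python
def expand_items(col1, col2, items):
    """Flatten (col1, col2, [(name, total), ...]) into one flat tuple.

    Hypothetical helper: assumes `items` is an iterable of 2-tuples.
    """
    flat = [col1, col2]
    for name, total in items:
        # Append each item's fields directly after the leading columns.
        flat.extend([name, total])
    return tuple(flat)


row = expand_items(1, 1, [("bird", 5), ("bear", 12), ("wolf", 10)])
print(row)  # (1, 1, 'bird', 5, 'bear', 12, 'wolf', 10)
```

Note that the length of the returned tuple depends on how many items the bag holds, so different records can end up with different numbers of columns.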
However, most of the time it is better to keep the same number of columns in every record (e.g. keep your array as a bag or tuple rather than "flattening" it into a single record).
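One way to read this advice, sketched in plain Python: instead of widening a single record, emit one fixed-width record per item, which is similar in spirit to what Pig's FLATTEN operator does when applied to a bag. The function name `one_row_per_item` is hypothetical, and it again assumes the bag is a list of (name, total) tuples.

```python
def one_row_per_item(col1, col2, items):
    """Return one (col1, col2, name, total) record per item.

    Hypothetical helper: every output record has exactly four columns,
    no matter how many items the input bag contains.
    """
    return [(col1, col2, name, total) for name, total in items]


for record in one_row_per_item(1, 1, [("bird", 5), ("bear", 12), ("wolf", 10)]):
    print(record)
```

Every record keeps the same four columns, so later steps do not have to deal with rows of varying width.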