Merge two lines in Pig

Question

I would like to write a Pig script for the query below.

Input:

ABC,DEF,,
,,GHI,JKL
MNO,PQR,,
,,STU,VWX

The output should be:

ABC,DEF,GHI,JKL
MNO,PQR,STU,VWX

Can anyone help me?

Recommended answer

This problem is difficult to solve in native Pig. One option is to download the datafu-1.2.0.jar library and try the approach below.

input.txt

ABC,DEF,,
,,GHI,JKL
MNO,PQR,,
,,STU,VWX

PigScript:

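-- Register the DataFu jar and define its BagSplit UDF
REGISTER /tmp/datafu-1.2.0.jar;
DEFINE BagSplit datafu.pig.bags.BagSplit();

-- Load the comma-separated rows; empty fields are loaded as null
A = LOAD 'input.txt' USING PigStorage(',') AS (f1,f2,f3,f4);
-- Collect every row into a single bag
B = GROUP A ALL;
-- Split that bag into smaller bags of two rows each
C = FOREACH B GENERATE FLATTEN(BagSplit(2,$1)) AS mybag;
-- Join each pair into one '_'-delimited string, drop the four nulls, and split it back into four fields
D = FOREACH C GENERATE FLATTEN(STRSPLIT(REPLACE(BagToString(mybag),'_null_null_null_null',''),'_',4));
-- Reorder the fields into f1,f2,f3,f4 order
E = FOREACH D GENERATE $2,$3,$0,$1;
DUMP E;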

Output:

(MNO,PQR,STU,VWX)
(ABC,DEF,GHI,JKL)

Note: Based on the input format above, I am assuming that the last two columns of the 1st row are null and the first two columns of the 2nd row are null, and likewise for the 3rd and 4th rows.
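
To make the pairing assumption concrete, here is a minimal Python sketch of the same idea outside Pig: take the rows two at a time and, for each column, keep whichever value is not null. It is not part of the original answer. The helper names parse_line and merge_pairs are hypothetical, and the sketch assumes the rows arrive in file order, which the GROUP ALL step in the Pig script does not by itself guarantee.

```python
from typing import List, Optional

Row = List[Optional[str]]


def parse_line(line: str) -> Row:
    """Split a comma-separated line; empty fields become None,
    mirroring the nulls that the Pig script strips out."""
    return [field if field else None for field in line.split(",")]


def merge_pairs(rows: List[Row]) -> List[Row]:
    """Merge rows 1+2, 3+4, ... by taking the first non-null value per column."""
    if len(rows) % 2 != 0:
        raise ValueError("expected an even number of rows")
    merged = []
    for i in range(0, len(rows), 2):
        first, second = rows[i], rows[i + 1]
        merged.append([a if a is not None else b for a, b in zip(first, second)])
    return merged


# Sample input taken from the question
lines = ["ABC,DEF,,", ",,GHI,JKL", "MNO,PQR,,", ",,STU,VWX"]
result = merge_pairs([parse_line(line) for line in lines])
for row in result:
    print(",".join(value or "" for value in row))
```

Unlike the string-replacement trick in the Pig script, this version does not depend on which half of the row is empty; it only assumes that each consecutive pair of rows fills complementary columns.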
