Not able to split a chararray field containing spaces and tabs between the words. Help me with the command using Apache Pig?


Problem description

2017-01-01 10:21:59 THURSDAY    -39 3 Pick up a bus - Travel for two hours
2017-02-01 12:45:19 FRIDAY  -55 8 Pick up a train - Travel for one hour
2017-03-01 11:35:49 SUNDAY  -55 8 Pick up a train - Travel for one hour
...

When I executed the suggested command, each line was split into three fields.

When I run the operations below, the split does not work as expected.

A = LOAD 'Sample.txt' USING PigStorage() as (line:chararray);
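-- FLATTEN expands the tuple returned by STRSPLIT into three separate fields
B = foreach A generate FLATTEN(STRSPLIT(line, ' ', 3));
-- alias names are case-sensitive, so this must be C (not c) to match the SPLIT below
C = foreach B generate $2;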
split C into buslog if $0 matches '.*bus*.', trainlog if $0 matches '.*train*.';

Note: a DUMP of C gives the result below.

THURSDAY    -39 3 Pick up a bus - Travel for two hours
FRIDAY  -55 8 Pick up a train - Travel for one hour
SUNDAY  -55 8 Pick up a train - Travel for one hour

Requirement: from the result above, I want to split the train and bus records into two relations, but it is not happening as expected.

Recommended answer

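The syntax is .*string.* with .* on both sides of the string. The matches operator uses Java regular expressions and has to match the entire field, not just part of it. In the original patterns the * comes before the final dot, so '.*bus*.' means "anything, then bu, then zero or more s, then exactly one last character". None of the rows end that way, so both relations come out empty.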

split C into buslog if $0 matches '.*bus.*', trainlog if $0 matches '.*train.*';
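
To see why the original patterns select nothing, here is a small sketch in Python rather than Pig. It assumes that re.fullmatch behaves like Pig's matches for these simple patterns (both require the whole string to match); the helper names pig_matches and split_rows are made up for this illustration and are not part of Pig.

```python
import re

# Rows as they appear in the DUMP of C from the question.
rows = [
    "THURSDAY    -39 3 Pick up a bus - Travel for two hours",
    "FRIDAY  -55 8 Pick up a train - Travel for one hour",
    "SUNDAY  -55 8 Pick up a train - Travel for one hour",
]


def pig_matches(value: str, pattern: str) -> bool:
    """Whole-string regex match (assumed stand-in for Pig's `matches`)."""
    return re.fullmatch(pattern, value) is not None


def split_rows(rows, bus_pattern, train_pattern):
    """Mimic SPLIT ... INTO buslog IF ..., trainlog IF ... on a list of strings."""
    buslog = [r for r in rows if pig_matches(r, bus_pattern)]
    trainlog = [r for r in rows if pig_matches(r, train_pattern)]
    return buslog, trainlog


# Patterns from the question: the * sits before the final dot.
broken = split_rows(rows, ".*bus*.", ".*train*.")

# Patterns from the answer: .* on both sides of the word.
fixed = split_rows(rows, ".*bus.*", ".*train.*")

print("broken:", broken)
print("fixed: ", fixed)
```

With the question's patterns both lists are empty; with the corrected patterns the bus row lands in the first list and the two train rows in the second.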

