STRSPLIT issue - Pig


Question

I have the following tuple H1 and I want to use STRSPLIT to split its $0 into a tuple. However, I always get an error message:

DUMP H1:
(item32;item31;,1)

m = FOREACH H1 GENERATE STRSPLIT($0, ";", 50);

ERROR 1000: Error during parsing. Lexical error at line 1, column 40. Encountered: after : "\";"

Does anyone know what's wrong with the script?

Recommended answer

Pig's parsing routines have an escaping problem when they encounter a semicolon inside a string literal in a single-line statement.

You can use a Unicode escape sequence for the semicolon: \u003B. It must also be backslash-escaped (written as \\u003B) and placed in a single-quoted string. Alternatively, you can rewrite the command over multiple lines with a nested FOREACH block, as in Neil's answer. In all cases, the delimiter must be a single-quoted string; the double-quoted variants all fail.

H1 = LOAD 'h1.txt' as (splitme:chararray, name);

A1 = FOREACH H1 GENERATE STRSPLIT(splitme,'\\u003B'); -- OK
B1 = FOREACH H1 GENERATE STRSPLIT(splitme,';');       -- ERROR
C1 = FOREACH H1 GENERATE STRSPLIT(splitme,':');       -- OK
D1 = FOREACH H1 {                                     -- OK
    splitup = STRSPLIT( splitme, ';' );
    GENERATE splitup;
}

A2 = FOREACH H1 GENERATE STRSPLIT(splitme,"\\u003B"); -- ERROR
B2 = FOREACH H1 GENERATE STRSPLIT(splitme,";");       -- ERROR
C2 = FOREACH H1 GENERATE STRSPLIT(splitme,":");       -- ERROR
D2 = FOREACH H1 {                                     -- ERROR
    splitup = STRSPLIT( splitme, ";" );
    GENERATE splitup;
}

Dump H1;
(item32;item31;,1)

Dump A1;
((item32,item31))

Dump C1;
((item32;item31;))

Dump D1;
((item32,item31))
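
Applied to the statement from the question, the fix might look like the sketch below. It assumes the same h1.txt and schema as the example above, and it keeps the alias m and the limit of 50 from the question. It has not been run against a particular Pig version.

```pig
-- Assumes the same h1.txt and schema as in the example above.
H1 = LOAD 'h1.txt' AS (splitme:chararray, name);

-- Single-quoted, backslash-escaped Unicode escape instead of ";".
-- The limit of 50 is carried over from the question.
m = FOREACH H1 GENERATE STRSPLIT(splitme, '\\u003B', 50);

DUMP m;
```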
