In Pig Latin, unable to load data as multiple tuples, please advise

Problem description

I am not able to load the data as multiple tuples, and I am not sure what mistake I am making. Please advise.

data.txt
vineet  1   pass    Govt
hisham  2   pass    Prvt
raj 3   fail    Prvt

I want to load each row as 2 tuples.

A = LOAD 'data.txt' USING PigStorage('\t') AS (T1:tuple(name:bytearray, no:int), T2:tuple(result:chararray, school:chararray));

A = LOAD 'data.txt' USING PigStorage('\t') AS (T1:(name:bytearray, no:int), T2:(result:chararray, school:chararray));

When I run DUMP A, only the empty tuples below are displayed, one per line. I don't know why the actual data from data.txt is not being read.

(,)
(,)
(,)

Recommended answer

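Because the input data is not stored in tuple form, it cannot be read directly into tuples. PigStorage splits each line into four flat fields, and the first two of them cannot be converted to the declared tuple types, so they come out empty.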

One feasible approach is to read the data as flat fields and then form the tuples from the required fields.

Pig script:

A = LOAD 'a.csv' USING PigStorage('\t') AS (name:chararray,no:int,result:chararray,school:chararray);
B = FOREACH A GENERATE (name,no) AS T1:tuple(name:chararray, no:int), (result,school) AS T2:tuple(result:chararray, school:chararray);
DUMP B;

Input: a.csv

vineet  1   pass    Govt
hisham  2   pass    Prvt
raj 3   fail    Prvt

Output: DUMP B:

((vineet,1),(pass,Govt))
((hisham,2),(pass,Prvt))
((raj,3),(fail,Prvt))

Output: DESCRIBE B:

B: {T1: (name: chararray,no: int),T2: (result: chararray,school: chararray)}
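
The same grouping can probably also be written with Pig's built-in TOTUPLE function instead of the parenthesized tuple constructor. The following is only a sketch under the same assumptions as the script above (a tab-delimited a.csv with four fields per row); the aliases A and B and the field names simply mirror that script and are not required by Pig.

```pig
-- Load the four flat, tab-separated fields first.
A = LOAD 'a.csv' USING PigStorage('\t') AS (name:chararray, no:int, result:chararray, school:chararray);

-- Build the two tuples from the flat fields with the built-in TOTUPLE function.
B = FOREACH A GENERATE TOTUPLE(name, no) AS T1, TOTUPLE(result, school) AS T2;

DUMP B;
```

Running DESCRIBE B after this variant is a quick way to check that the resulting schema matches what downstream statements expect.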
