Pig referencing

Question

I am learning Hadoop Pig, and I always get stuck when referencing elements. Please see the example below.

groupwordcount: {group: chararray,words: {(bag_of_tokenTuples_from_line::token: chararray)}}

Can somebody please explain how to reference the elements when we have nested tuples and bags?

Any links that help with understanding nested referencing would be a great help.

Solution

Let's do a simple demonstration to understand this problem.

Say a file 'a.txt' is stored at '/tmp/a.txt' in HDFS:

A = LOAD '/tmp/a.txt' using PigStorage(',') AS (name:chararray,term:chararray,gpa:float);

Dump A;

(John,fl,3.9)

(John,fl,3.7)

(John,sp,4.0)

(John,sm,3.8)

(Mary,fl,3.8)

(Mary,fl,3.9)

(Mary,sp,4.0)

(Mary,sm,4.0)

Now let's group alias 'A' by some fields, say name and term:

B = GROUP A BY (name,term);

dump B;

((John,fl),{(John,fl,3.7),(John,fl,3.9)})

((John,sm),{(John,sm,3.8)})

((John,sp),{(John,sp,4.0)})

((Mary,fl),{(Mary,fl,3.9),(Mary,fl,3.8)})

((Mary,sm),{(Mary,sm,4.0)})

((Mary,sp),{(Mary,sp,4.0)})

describe B;

B: {group: (name: chararray,term: chararray),A: {(name: chararray,term: chararray,gpa: float)}}

Now this has the same shape as the schema in your question. Let me demonstrate how to access the elements of the group tuple, the elements of the tuples in bag A, or both:

C = foreach B generate group.name,group.term,A.name,A.term,A.gpa;

dump C;

(John,fl,{(John),(John)},{(fl),(fl)},{(3.7),(3.9)})

(John,sm,{(John)},{(sm)},{(3.8)})

(John,sp,{(John)},{(sp)},{(4.0)})

(Mary,fl,{(Mary),(Mary)},{(fl),(fl)},{(3.9),(3.8)})

(Mary,sm,{(Mary)},{(sm)},{(4.0)})

(Mary,sp,{(Mary)},{(sp)},{(4.0)})
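The same references can also feed an aggregate, FLATTEN, or a nested FOREACH. Below is a minimal sketch, assuming relation B as defined above; the aliases D, E, F and G and the output field names are hypothetical, chosen only for illustration.

```pig
-- Assumes B: {group: (name, term), A: {(name, term, gpa)}} from the GROUP above.

-- A.gpa is a bag of one-field tuples, so it can be passed to an aggregate.
D = FOREACH B GENERATE group.name AS name, group.term AS term,
                       COUNT(A) AS n, AVG(A.gpa) AS avg_gpa;

-- FLATTEN un-nests: the group tuple becomes top-level fields,
-- and the bag yields one output row per tuple it contains.
E = FOREACH B GENERATE FLATTEN(group), FLATTEN(A.gpa);

-- Positional references are an alternative to names at each level.
F = FOREACH B GENERATE group.$0, A.$2;

-- A nested FOREACH can work on the bag before generating output,
-- here keeping only the highest gpa in each (name, term) group.
G = FOREACH B {
    sorted = ORDER A BY gpa DESC;
    best = LIMIT sorted 1;
    GENERATE group.name, group.term, FLATTEN(best.gpa);
};
```

Running describe on each alias is a quick way to check which level of nesting a given reference produced.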

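That is how every element is accessed: group.name reads a field of the group tuple, while A.name projects that field from every tuple in bag A and therefore returns a bag.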
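For the schema in the question, the same rules should apply. This is a sketch only, assuming a relation named groupwordcount with exactly the schema shown in the question; the aliases wordcount, tokens_full and tokens_pos are hypothetical.

```pig
-- Assumed schema, copied from the question:
-- groupwordcount: {group: chararray,
--                  words: {(bag_of_tokenTuples_from_line::token: chararray)}}

-- group is a plain chararray here (the grouping used a single field),
-- so it is referenced directly. words is a bag, so it suits an aggregate.
wordcount = FOREACH groupwordcount GENERATE group AS word, COUNT(words) AS total;

-- The field inside the bag can be referenced by its fully qualified name...
tokens_full = FOREACH groupwordcount
              GENERATE group, words.bag_of_tokenTuples_from_line::token;

-- ...or by position, which avoids the long prefix.
tokens_pos = FOREACH groupwordcount GENERATE group, words.$0;
```

The `::` prefix only records which relation a field came from, so the positional form is often the easier way to reach a field whose name carries a long prefix.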

Hope this helps.
