Understanding map syntax

Question

I am having trouble understanding how maps should be used.

Following this tutorial, I created a file containing the following text:

[open#apache]
[apache#hadoop]

Then, I was able to load that file without errors:

a = load 'data/file_name.txt' as (M:map[]);

Now, how can I get the list of all the "values"? I.e.

(apache)
(hadoop) 

Furthermore, I have just started to learn Pig, so any hints would be very helpful.

Recommended answer

There is only one way to interact with a map, and that is to use the # operator. To get more functionality, you'll have to define some UDFs. Therefore the only way a map can really be used in pure Pig is like this:

B = FOREACH A GENERATE M#'open' ;

Which produces this as output:

(apache)
()

Note that the value after the # is a quoted string; it cannot change and must be set before you run the job.

Also, notice that it creates a NULL for the second line, because that map does not contain a key with the value 'open'. This is slightly different from using FILTER on a schema of two chararrays, key and value:

B = FILTER A BY key=='open' ;

Which produces the output:

(open,apache)

If only the value is desired, that can be done simply with:

B = FOREACH (FILTER A BY key=='open') GENERATE value ;

Which produces:

(apache)

If keeping the NULLs is important, they can also be generated by using a bincond:

B = FOREACH A GENERATE (key=='open'?value:NULL) ;

Which produces the same output as M#'open'.
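To make the difference between the three approaches concrete, here is a small Python sketch that models their row-by-row behaviour. It is only a toy model, not Pig code: Pig maps are stood in for by Python dicts, NULL by None, and the variable names (rows_as_maps, rows_as_pairs and so on) are made up for illustration.

```python
# Toy model of the Pig semantics discussed above; this is NOT Pig code.
# Assumption: each row of relation A holds one single-entry map,
# matching the two-line input file from the question.
rows_as_maps = [{"open": "apache"}, {"apache": "hadoop"}]

# Models M#'open': one output row per input row,
# with None (NULL) where the key is absent.
lookup = [m.get("open") for m in rows_as_maps]

# The same data under a (key, value) schema of two chararrays.
rows_as_pairs = [("open", "apache"), ("apache", "hadoop")]

# Models FILTER A BY key == 'open': non-matching rows are dropped.
filtered = [(k, v) for k, v in rows_as_pairs if k == "open"]

# Models FOREACH (FILTER A BY key == 'open') GENERATE value.
filtered_values = [v for k, v in filtered]

# Models the bincond (key == 'open' ? value : NULL): every row is kept.
bincond = [v if k == "open" else None for k, v in rows_as_pairs]

print(lookup)           # ['apache', None]
print(filtered)         # [('open', 'apache')]
print(filtered_values)  # ['apache']
print(bincond)          # ['apache', None]
```

The lookup and the bincond both keep one output row per input row, while the filter shrinks the relation; that is the whole difference between the variants above.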

From my experience, maps are not very useful because of how restrictive they are.
