Understanding map syntax
Question
I have some problems understanding how a map should be used.
Following this tutorial, I created a file containing the following text:
[open#apache]
[apache#hadoop]
Then I was able to load that file without errors:
a = load 'data/file_name.txt' as (M:map []);
Now, how can I get the list of all the "values"? I.e.
(apache)
(hadoop)
Furthermore, I have just started to learn Pig, so every hint is going to be very helpful.
Recommended answer
There is only one way to interact with a map, and that is to use the # operator. For more functionality, you'll have to define some UDFs. Therefore the only way a map can really be used in pure Pig is like:
B = FOREACH A GENERATE M#'open' ;
Which produces this as output:
(apache)
()
Note that the value after the # is a quoted string; it cannot change and must be set before you run the job.
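The lookup behavior can be illustrated with a small Python model. This is not Pig code, only a sketch of the semantics as described here: each row holds one map, and looking up a key that is absent yields a null (None in Python). The names rows and lookup are made up for the illustration.

```python
# Rough Python model of Pig's map lookup; this is NOT Pig code.
# Each input row holds one map, mirroring the two lines of the sample file.
rows = [
    {"open": "apache"},    # [open#apache]
    {"apache": "hadoop"},  # [apache#hadoop]
]


def lookup(m, key):
    """Model of M#'key': the value stored under key, or None (Pig NULL) if absent."""
    return m.get(key)


# Similar in spirit to: B = FOREACH A GENERATE M#'open' ;
# The key is a fixed constant, just as it must be in the Pig script.
result = [lookup(m, "open") for m in rows]
print(result)  # ['apache', None]
```

One output is produced per input row, which is why the second row shows up as an empty tuple in the Pig output rather than disappearing.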
Also, notice that it creates a NULL for the second line, because that map does not contain a key with the value 'open'. This is slightly different from using FILTER on a schema of two chararrays, key and value:
B = FILTER A BY key=='open' ;
Which produces the output:
(open,apache)
If only the value is desired, it can be done simply with:
B = FOREACH (FILTER A BY key=='open') GENERATE value ;
Which produces:
(apache)
If keeping the NULLs is important, they can also be generated by using a bincond:
B = FOREACH A GENERATE (key=='open'?value:NULL) ;
Which produces the same output as M#'open'.
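To make the contrast between the three key/value variants concrete, here is a small Python model. Again, this is not Pig code, just a sketch that assumes the same two records stored as (key, value) pairs instead of maps; the variable names are made up for the illustration.

```python
# Rough Python model of the key/value schema; this is NOT Pig code.
rows = [("open", "apache"), ("apache", "hadoop")]

# Like FILTER A BY key=='open': rows that do not match are dropped.
filtered = [(k, v) for k, v in rows if k == "open"]

# Like FOREACH (FILTER ...) GENERATE value: keep only the value column.
values = [v for k, v in filtered]

# Like the bincond (key=='open'?value:NULL): one output per input row,
# with None standing in for Pig's NULL.
bincond = [v if k == "open" else None for k, v in rows]

print(filtered)  # [('open', 'apache')]
print(values)    # ['apache']
print(bincond)   # ['apache', None]
```

The FILTER variants shrink the relation, while the bincond keeps its length and fills the gaps with nulls, matching what the map lookup does.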
In my experience, maps are not very useful because of how restrictive they are.