Understanding map syntax
Question
I am having trouble understanding how the map type should be used in Pig.
Following this tutorial, I created a file containing the following text:
[open#apache]
[apache#hadoop]
Then I was able to load that file without errors:
a = load 'data/file_name.txt' as (M:map []);
Now, how can I get a list of all the "values"? That is:
(apache)
(hadoop)
Also, I have just started learning Pig, so any hints would be very helpful.
Recommended answer
There is only one way to interact with a map, and that is the # operator. To get more functionality than that, you will have to define some UDFs. So the only way a map can really be used in pure Pig is like this:
B = FOREACH A GENERATE M#'open' ;
This produces the output:
(apache)
()
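As a minimal end-to-end sketch of the lookup above, assuming the file from the question (data/file_name.txt, with one bracketed map per line): several fixed keys can be requested in the same FOREACH, and each becomes its own column. The alias names open_value and apache_value are illustrative and not part of the original answer.

```pig
-- Load each line, e.g. [open#apache], as a single untyped map field M.
A = LOAD 'data/file_name.txt' AS (M:map[]);

-- Look up fixed keys with the # operator. The key is a quoted constant,
-- so every key of interest has to be written out in the script.
-- A row whose map lacks a key gets null in that column.
B = FOREACH A GENERATE M#'open' AS open_value, M#'apache' AS apache_value;

DUMP B;
```

This only helps when the keys are known in advance, which is why it does not directly give a list of all values for arbitrary keys.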
Note that the value after the # is a quoted string; it cannot change and must be set before you run the job.
Also, notice that it creates a NULL for the second line, because that map does not contain a key with the value 'open'. This is slightly different from using FILTER on a schema of two chararrays, key and value, which drops non-matching rows instead of returning NULL for them:
B = FILTER A BY key=='open' ;
Which produces the output:
(open,apache)
If only the value is desired, it can be done simply with:
B = FOREACH (FILTER A BY key=='open') GENERATE value ;
Which produces:
(apache)
If keeping the NULLs is important, they can also be generated by using a bincond:
B = FOREACH A GENERATE (key=='open'?value:NULL) ;
This produces the same output as M#'open'.
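The FILTER and bincond examples assume a relation with two chararray fields rather than a map. A minimal sketch of how such a relation might be loaded, assuming a hypothetical tab-separated file data/pairs.txt (not in the original question) holding the same pairs, one per line, such as open<TAB>apache and apache<TAB>hadoop:

```pig
-- Hypothetical input file: one tab-separated key/value pair per line.
-- The default loader splits fields on tabs.
A = LOAD 'data/pairs.txt' AS (key:chararray, value:chararray);

-- Every value, regardless of key. For the pairs above this is the
-- (apache), (hadoop) list the question asks for.
V = FOREACH A GENERATE value;

-- Only the rows whose key matches; non-matching rows are dropped.
F = FILTER A BY key == 'open';

DUMP V;
```

With this layout the keys are ordinary data, so they can be projected, filtered, or grouped without knowing them in advance.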
In my experience, maps are not very useful because of how restrictive they are.