Multiple Output in Reducer
Problem description
I am working on a simple MapReduce program. I want to create a separate file after the reduce phase for each distinct word in the key. For example, after running the MapReduce job I have output like this:
priority1 x 2
priority1 y 2
priority1 z 2
priority2 x 2
priority2 x 2
priority2 y 2
Now I want separate files after the reduce phase, say Priority1 and Priority2, each holding all the values for its priority. I am using Java and want to know what should be written in the reducer to produce this kind of output.
I just want to know whether this is even possible and, if it is, how to approach it. I am using Hadoop 0.20.203, and hence MultipleOutputs doesn't work.
Any pointers will be helpful. Thanks for the help! Atul
Recommended answer
You first need to create a Partitioner class that partitions based on your criteria.
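As a rough illustration of the partitioning step, here is a minimal sketch of the decision such a class would make. It is written as plain Java so it runs without Hadoop on the classpath; in a real job the same logic would sit inside the getPartition(key, value, numPartitions) method of a class extending org.apache.hadoop.mapreduce.Partitioner. The class name PriorityPartitionSketch, the helper priorityOf, and the assumed key layout ("priority1 x", with the priority as the first token) are hypothetical and not taken from the question.

```java
import java.util.Arrays;
import java.util.List;

// Standalone sketch of the routing decision a custom Hadoop Partitioner
// would make. Names and key layout are assumptions for illustration.
final class PriorityPartitionSketch {

    // Assumption: keys look like "priority1 x" and the first
    // whitespace-separated token is the partitioning criterion.
    static String priorityOf(String key) {
        String trimmed = key.trim();
        int space = trimmed.indexOf(' ');
        return space < 0 ? trimmed : trimmed.substring(0, space);
    }

    // Maps a key to a partition index in [0, numPartitions).
    static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit so a negative hashCode cannot give a negative index.
        return (priorityOf(key).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("priority1 x", "priority1 y", "priority2 x");
        for (String key : keys) {
            System.out.println(key + " -> partition " + partitionFor(key, 2));
        }
    }
}
```

Keys that share a priority always land in the same partition, and each reducer writes its own output file. Hashing does not guarantee that two different priorities end up in different partitions, so if each priority must have its own reducer, an explicit mapping from priority to index would be the safer choice.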
You then need to create your own OutputFormat class and a RecordWriter class.
The RecordWriter class needs to write to different files as per your needs. Further, if you need to sort your values, create a comparator class for your key field.
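The per-file writing described above could look roughly like the sketch below. It is plain Java that writes to the local file system, so it is only a stand-in: in a real job this logic would live in a class extending org.apache.hadoop.mapreduce.RecordWriter, returned from your OutputFormat's getRecordWriter method, and it would open its streams through Hadoop's FileSystem API instead of java.nio. The class name PerPriorityWriterSketch and the key layout ("priority1 x", priority as the first token) are assumptions for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Standalone sketch of a writer that routes each record to a file named
// after its priority. Mirrors the write/close shape of a RecordWriter.
final class PerPriorityWriterSketch implements AutoCloseable {
    private final Path outputDir;
    // One open writer per priority seen so far.
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    PerPriorityWriterSketch(Path outputDir) throws IOException {
        this.outputDir = Files.createDirectories(outputDir);
    }

    // Assumption: the first whitespace-separated token of the key is the priority.
    void write(String key, String value) throws IOException {
        String trimmed = key.trim();
        int space = trimmed.indexOf(' ');
        String priority = space < 0 ? trimmed : trimmed.substring(0, space);

        BufferedWriter out = writers.get(priority);
        if (out == null) {
            // Lazily open one file per priority, e.g. <outputDir>/priority1.
            out = Files.newBufferedWriter(outputDir.resolve(priority), StandardCharsets.UTF_8);
            writers.put(priority, out);
        }
        out.write(trimmed + "\t" + value);
        out.newLine();
    }

    @Override
    public void close() throws IOException {
        for (BufferedWriter out : writers.values()) {
            out.close();
        }
        writers.clear();
    }
}
```

Opening the files lazily means a file is only created for a priority that actually occurs in the data. When several reducers run, each task would need its own file names (for example by including the task ID) so that they do not overwrite one another.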