Multiple Output in Reducer


Problem description

I am working on a simple MapReduce program. I want to create a separate output file after the reducer for each distinct word in the key. For example, after running the MapReduce job I have something like:

priority1 x 2

priority1 y 2

priority1 z 2

priority2 x 2

priority2 y 2

Now I want separate files after the reduce phase, say Priority1 and Priority2, each containing all the values for that priority. I am using Java and want to know what should be written in the reducer to get this kind of output.

I just want to know whether this is even possible and, if it is, how to approach it. I am using Hadoop 0.20.203, so MultipleOutputs doesn't work.

Any pointers will be helpful. Thanks for the help! Atul

Recommended answer

You need to create a partitioner class first that partitions based on your criteria.
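As a rough illustration of the partitioning criterion, here is a minimal plain-Java sketch. It assumes the key is text such as "priority1 x" and that the first whitespace-separated token is the priority; the class and method names are hypothetical and not part of Hadoop. In an actual job this logic would sit inside the `getPartition(key, value, numPartitions)` method of a class extending `org.apache.hadoop.mapreduce.Partitioner`.

```java
// Hypothetical helper (not a Hadoop class): shows only the partitioning rule.
final class PriorityPartitionSketch {

    // Returns the first whitespace-separated token of the key,
    // e.g. "priority1" for the key "priority1 x" (assumed key layout).
    static String priorityOf(String key) {
        return key.trim().split("\\s+", 2)[0];
    }

    // Maps a key to a partition index in [0, numPartitions).
    // Only the priority token is hashed, so every key with the same
    // priority lands in the same partition (and thus the same reducer).
    static int partitionFor(String key, int numPartitions) {
        // Clear the sign bit so the modulo result is never negative.
        return (priorityOf(key).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }
}
```

Hashing keeps all records of one priority together, but two different priorities can still hash to the same partition, so the partitioner alone does not guarantee one file per priority.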

You then need to create your own OutputFormat class and a RecordWriter class.
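The core of such a record writer might look like the following sketch, which routes each record to a file named after its priority. It is written against plain `java.nio` so that it runs on its own; the class name and the tab-separated line layout are assumptions. In a real job the equivalent logic would go into the `write(key, value)` and `close(context)` methods of a `RecordWriter` returned by your `OutputFormat`'s `getRecordWriter`, and the files would have to be created on the job's file system rather than the local disk.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a custom RecordWriter: one output file per priority.
final class PerPriorityWriterSketch implements AutoCloseable {
    private final Path outputDir;
    // One open writer per priority seen so far.
    private final Map<String, BufferedWriter> writers = new HashMap<>();

    PerPriorityWriterSketch(Path outputDir) throws IOException {
        this.outputDir = Files.createDirectories(outputDir);
    }

    // Appends "key<TAB>value" to the file named after the key's first token.
    void write(String key, String value) throws IOException {
        String priority = key.trim().split("\\s+", 2)[0];
        BufferedWriter writer = writers.get(priority);
        if (writer == null) {
            // First record for this priority: open its file lazily.
            writer = Files.newBufferedWriter(outputDir.resolve(priority), StandardCharsets.UTF_8);
            writers.put(priority, writer);
        }
        writer.write(key + "\t" + value);
        writer.newLine();
    }

    // Flushes and closes every per-priority file.
    @Override
    public void close() throws IOException {
        for (BufferedWriter writer : writers.values()) {
            writer.close();
        }
        writers.clear();
    }
}
```

Opening files lazily means only priorities that actually occur produce a file. If several reducers can receive the same priority, each would need a distinct file name (for example, one that includes the task ID) to avoid overwriting another task's output.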

The RecordWriter class needs to write to different files as per your needs. Further, if you need to sort your values, create a comparator class for your key field.
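For the sorting step, a comparator along these lines could order keys by priority first and by the rest of the key second. This is a plain-Java sketch under the same assumed key layout ("priority1 x"), with a hypothetical class name. In Hadoop the comparison would normally be implemented in a `RawComparator` (commonly by extending `WritableComparator`) and registered with `Job.setSortComparatorClass`.

```java
import java.util.Comparator;

// Hypothetical comparator: orders keys by priority token, then by the remainder.
final class PriorityKeyComparatorSketch implements Comparator<String> {
    @Override
    public int compare(String left, String right) {
        // Split each key into at most two parts: priority and remainder.
        String[] a = left.trim().split("\\s+", 2);
        String[] b = right.trim().split("\\s+", 2);

        int byPriority = a[0].compareTo(b[0]);
        if (byPriority != 0) {
            return byPriority;
        }
        // Same priority: fall back to the rest of the key (empty if absent).
        String restA = a.length > 1 ? a[1] : "";
        String restB = b.length > 1 ? b[1] : "";
        return restA.compareTo(restB);
    }
}
```

The comparison is purely lexicographic, so "priority10" would sort before "priority2"; a numeric comparison of the suffix would be needed if that matters.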
