Is it possible to have multiple inputs with multiple different mappers in Hadoop MapReduce?

Views: 20 | Published: 2022/1/13 23:38:25 | Tags: hadoop, mapreduce


Problem description

Is it possible to have multiple inputs with multiple different mappers in Hadoop MapReduce? Each mapper class works on a different set of inputs, but they all emit key-value pairs that are consumed by the same reducer. Note that I'm not talking about chaining mappers here; I'm talking about running different mappers in parallel, not sequentially.

Recommended answer

This is called a join.

One option is to use the mappers and reducers in the mapred.* packages (older, but still supported). When this answer was written, the newer mapreduce.* packages only allowed one mapper per job; later Hadoop releases also ship org.apache.hadoop.mapreduce.lib.input.MultipleInputs, which takes a Job instead of a JobConf. With the mapred packages, you use the MultipleInputs class to define the join:

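// Imports for the old (mapred) API used in this fragment.
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.SequenceFileInputFormat;
import org.apache.hadoop.mapred.lib.MultipleInputs;

// Each input path is bound to its own InputFormat and its own Mapper class.
MultipleInputs.addInputPath(jobConf,
                     new Path(countsSource),
                     SequenceFileInputFormat.class,
                     CountMapper.class);
MultipleInputs.addInputPath(jobConf,
                     new Path(dictionarySource),
                     SomeOtherInputFormat.class,
                     TranslateMapper.class);

// Both mappers feed the same reducer.
jobConf.setJarByClass(ReportJob.class);
jobConf.setReducerClass(WriteTextReducer.class);

// Both mappers must emit the same intermediate key and value types.
jobConf.setMapOutputKeyClass(Text.class);
jobConf.setMapOutputValueClass(WordInfo.class);

// Types of the final records written by the reducer.
jobConf.setOutputKeyClass(Text.class);
jobConf.setOutputValueClass(Text.class);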