Passing arguments to Hadoop mappers

This article describes how to pass arguments to Hadoop mappers.

Problem description

I'm using the new Hadoop API and am looking for a way to pass some parameters (a few strings) to the mappers.
How can I do that?

This solution works for the old API:

// In the job driver; getConf() is typically available when the driver extends Configured / implements Tool
JobConf job = (JobConf) getConf();
job.set("NumberOfDocuments", args[0]);

Here, "NumberOfDocuments" is the name of the parameter, and its value is read from "args[0]", a command-line argument. Once you set this parameter, you can retrieve its value in the reducer or mapper as follows:

private static Long N;

public void configure(JobConf job) {
    // configure() is called once per task before any map()/reduce() calls; read the parameter here
    N = Long.parseLong(job.get("NumberOfDocuments"));
}
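
For context, in the old (mapred) API this configure() override usually lives in a mapper class that extends MapReduceBase. A minimal sketch under that assumption (the class name OldApiMapper and the key/value types are illustrative, not from the original post):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class OldApiMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, LongWritable> {

    private long numberOfDocuments;

    @Override
    public void configure(JobConf job) {
        // Called once per task with the job's configuration.
        numberOfDocuments = Long.parseLong(job.get("NumberOfDocuments"));
    }

    @Override
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, LongWritable> output, Reporter reporter)
            throws IOException {
        // Use the parameter however the job needs; here it is simply emitted alongside the input.
        output.collect(value, new LongWritable(numberOfDocuments));
    }
}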

Note, the tricky part is that you cannot set parameters like this: a Configuration object created on its own like the one below is never handed to the job, so the mappers and reducers will not see the value.

Configuration con = new Configuration();
con.set("NumberOfDocuments", args[0]);

Recommended answer

In the main method, set the required parameter as below, or use the -D command-line option when running the job.

Configuration conf = new Configuration();
conf.set("test", "123");   // must be set before the Job is constructed

Job job = new Job(conf);
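
The -D option only reaches the job's Configuration if the driver lets Hadoop parse generic options, for example via ToolRunner/GenericOptionsParser. A minimal driver sketch assuming that setup (the class name ParamDemo and the "test" key are illustrative; mapper/reducer wiring is left as a comment):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ParamDemo extends Configured implements Tool {

    @Override
    public int run(String[] args) throws Exception {
        // getConf() already contains any -D name=value pairs passed on the command line,
        // e.g.  hadoop jar app.jar ParamDemo -D test=123 <in> <out>
        Configuration conf = getConf();
        // A value set programmatically here also reaches the tasks,
        // as long as it happens before the Job is created.
        conf.set("test", "123");

        Job job = new Job(conf);
        job.setJarByClass(ParamDemo.class);
        // ... set mapper/reducer classes and input/output paths here ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new ParamDemo(), args));
    }
}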

In the mapper/reducer, get the parameter as follows:

Configuration conf = context.getConfiguration();
String param = conf.get("test");
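
In the new (mapreduce) API, the usual place for this lookup is the mapper's setup() method, which runs once per task before any map() calls. A minimal sketch using the "test" key from the answer above (the class name ParamMapper and the key/value types are illustrative):

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ParamMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

    private String param;   // holds the value passed from the driver or the -D option

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // Read the parameter once per task instead of on every map() call.
        param = context.getConfiguration().get("test", "default");
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Use 'param' however the job needs; here it is simply prepended to each input line.
        context.write(new Text(param + ":" + value.toString()), new LongWritable(1));
    }
}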

That concludes this article on passing arguments to Hadoop mappers; hopefully the answer above is helpful.
