How to pass system property to map function in Hadoop


Problem Description



Is there a way to pass a system parameter (something like -Dmy_param=XXX) to the map function in the Hadoop MapReduce framework? Job submission to the Hadoop cluster is done via .setJarByClass(). In the mapper I have to create a configuration, and I would like to make it configurable, so I thought the standard way via a property file would be fine. I am just struggling with how to pass the parameter that says where the property is set. Another option would be to add the property file to the submitted jar. Does anyone have experience with how to solve this?

Solution

If you haven't already used them in your job, you can try GenericOptionsParser, Tool, and ToolRunner for running your Hadoop job.

Note: MyDriver extends Configured and implements Tool. Then, to run your job, use this:

hadoop jar somename.jar MyDriver -D your.property=value arg1 arg2
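Note that GenericOptionsParser only picks up -D (and the other generic options) when they appear before your own program arguments. With the property name from the question it would look something like this (the input/output paths are just placeholders):

hadoop jar somename.jar MyDriver -D my_param=XXX /input/path /output/path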

For more information, check the documentation for Tool and ToolRunner.

Here's some sample code I prepared for you:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {

  public static class MyDriverMapper extends Mapper<LongWritable, Text, LongWritable, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // In the mapper you can retrieve any configuration you've set
      // while starting the job from the terminal (the -D option shown above).
      Configuration conf = context.getConfiguration();
      String yourPropertyValue = conf.get("your.property");
      // Or supply a fallback in case the property was not passed:
      // String withDefault = conf.get("your.property", "some-default");
    }
  }

  public static class MyDriverReducer extends Reducer<LongWritable, NullWritable, LongWritable, NullWritable> {

    @Override
    protected void reduce(LongWritable key, Iterable<NullWritable> values, Context context)
        throws IOException, InterruptedException {
      // --- some code ---
    }
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MyDriver(), args);
    System.exit(exitCode);
  }

  @Override
  public int run(String[] args) throws Exception {
    // getConf() returns the Configuration that GenericOptionsParser has
    // already populated, so -D your.property=value is visible here.
    Configuration conf = getConf();
    // If you want, you can also get/set values on conf here.
    // your.property can also be a file location; you would then read
    // the properties from that file and set them on conf one by one.

    // --- other code (e.g. setJarByClass, mapper/reducer classes, paths) --- //
    Job job = Job.getInstance(conf, "My Sample Job");
    // --- other code --- //
    return job.waitForCompletion(true) ? 0 : 1;
  }
}
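The question also mentions shipping a properties file instead of passing individual -D flags. One way to combine that with the driver above, as hinted at in the run() comments, is to load a standard Java properties file and copy each entry into the Configuration before the job is submitted. Below is a minimal sketch assuming a file named my-job.properties on the driver's classpath; that file name is a placeholder, not something from the original answer:

    // Inside run(), before creating the Job:
    Configuration conf = getConf();
    java.util.Properties props = new java.util.Properties();
    try (java.io.InputStream in =
            MyDriver.class.getClassLoader().getResourceAsStream("my-job.properties")) {
      if (in != null) {        // the file is optional in this sketch
        props.load(in);
      }
    }
    // Copy every entry into the job Configuration so that mappers and
    // reducers can read it via context.getConfiguration().get(...)
    for (String name : props.stringPropertyNames()) {
      conf.set(name, props.getProperty(name));
    }

Because the values end up in the job Configuration either way, the mapper shown above does not change: it still reads them with conf.get("your.property").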
