setJarByClass() in Hadoop


Question

At some point in the driver method of a Hadoop algorithm we link the job to the classes set as Mapper and Reducer. For example:

        job.setMapperClass(MyMapper.class);
        job.setReducerClass(MyReducer.class);

Usually the driver method is the main, while the mapper and reducer are implemented as inner static classes.

Suppose that MyMapper.class and MyReducer.class are inner static classes of MyClass.class, and that the driver method is the main of MyClass.class. Sometimes I see the following line added right after the two above:

        job.setJarByClass(MyClass.class);

What is the meaning of this configuration step, and when is it useful or mandatory?

In my case (I have a single-node cluster installation), if I remove this line I can still run the job correctly. Why?
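
For reference, here is a minimal sketch of the layout described above; the key/value types (LongWritable, Text, IntWritable) are placeholder assumptions, not part of the original question:

        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Mapper;
        import org.apache.hadoop.mapreduce.Reducer;

        public class MyClass {

            // mapper as an inner static class of the driver class
            public static class MyMapper
                    extends Mapper<LongWritable, Text, Text, IntWritable> {
                // map() omitted
            }

            // reducer as an inner static class of the driver class
            public static class MyReducer
                    extends Reducer<Text, IntWritable, Text, IntWritable> {
                // reduce() omitted
            }

            // the driver method is the main of MyClass
            public static void main(String[] args) throws Exception {
                // job setup shown above (setMapperClass, setReducerClass, ...)
            }
        }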

Answer

Here we help Hadoop find out which jar it should send to the nodes to perform the Map and Reduce tasks. Our abc-jar.jar might have various other jars on its classpath, and our driver code might live in a different jar or location than our Mapper and Reducer classes.

Hence, with the setJarByClass method we tell Hadoop to find the relevant jar by locating the jar that contains the class passed as the parameter. So usually we should pass either the Mapper implementation, the Reducer implementation, or any other class that lives in the same jar as the Mapper and Reducer. Also make sure that both the Mapper and the Reducer are part of the same jar.
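
As an illustration, here is a hedged sketch of a complete driver with setJarByClass in place. The job name and the command-line input/output paths are assumptions for the example; the point is that Hadoop locates the jar containing MyClass and ships it to the task nodes, and since MyMapper and MyReducer are nested inside MyClass they end up in that same jar:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.Mapper;
        import org.apache.hadoop.mapreduce.Reducer;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

        public class MyClass {

            // stubs standing in for the real implementations
            public static class MyMapper
                    extends Mapper<LongWritable, Text, Text, IntWritable> { /* map() omitted */ }

            public static class MyReducer
                    extends Reducer<Text, IntWritable, Text, IntWritable> { /* reduce() omitted */ }

            public static void main(String[] args) throws Exception {
                Job job = Job.getInstance(new Configuration(), "MyClass job");

                // Hadoop finds the jar that contains MyClass and sends it to the
                // nodes running the map and reduce tasks; MyMapper and MyReducer
                // are nested in MyClass, so they are part of that same jar.
                job.setJarByClass(MyClass.class);

                job.setMapperClass(MyMapper.class);
                job.setReducerClass(MyReducer.class);

                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);

                // assumed for the example: paths passed on the command line
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));

                System.exit(job.waitForCompletion(true) ? 0 : 1);
            }
        }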

Reference: http://www.bigdataspeak.com/2014/06/what-is-need-to-use-jobsetjarbyclass-in.html
