Class not found in Hadoop job

Question

I have a MapReduce job that reads its input from DocumentDB. I've added the jar files under the lib directory of my source tree and also used -libjars when running the job, but I still get a class-not-found error for a class in one of those jars. Here is part of my driver program:

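import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MapReduceDriver extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        // ToolRunner parses generic options such as -libjars before calling run()
        int res = ToolRunner.run(new Configuration(), new MapReduceDriver(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        // Use the configuration populated by ToolRunner, not a fresh one
        Configuration conf = this.getConf();
        ....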

When using -libjars, I tried putting the required jar files once on the local file system of the driver and once on HDFS, but neither worked. How can I make sure that -libjars works?

P.S. I'm using a 2-node HDInsight cluster (running in Microsoft Azure).

Here is the error message I get:

 Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.microsoft.azure.documentdb.hadoop.DocumentDBInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1961)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getInputFormatClass(JobContextImpl.java:174)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:726)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.lang.ClassNotFoundException: Class com.microsoft.azure.documentdb.hadoop.DocumentDBInputFormat not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1867)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1959)
    ... 8 more

Recommended answer

HDInsight uses Templeton (WebHCat), which does not support -libjars, so you can't use that option here; see the Templeton documentation.

Also, I'm assuming you are building a custom HDInsight cluster with a PowerShell script. In that case you can copy all the jars, together with their dependencies, to HADOOP_HOME + '\share\hadoop\common\lib', which is the Hadoop lib folder.
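The copy step can also be scripted in Java rather than PowerShell. The following is only a minimal sketch under assumptions that are not in the answer: the class name CopyDependencyJars and the helper copyJars are hypothetical, HADOOP_HOME is assumed to be set as an environment variable, and the source directory is passed as the single argument. Because map tasks run on the worker nodes, the jars would presumably have to be copied on every node, not just the one where the job is submitted.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop or HDInsight.
public class CopyDependencyJars {

    /** Copies every *.jar in sourceDir into targetDir and returns the copied file names. */
    public static List<String> copyJars(Path sourceDir, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        List<String> copied = new ArrayList<>();
        // The glob restricts the listing to jar files directly inside sourceDir
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(sourceDir, "*.jar")) {
            for (Path jar : jars) {
                Files.copy(jar, targetDir.resolve(jar.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                copied.add(jar.getFileName().toString());
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: HADOOP_HOME is available as an environment variable
        String hadoopHome = System.getenv("HADOOP_HOME");
        if (hadoopHome == null || args.length != 1) {
            System.err.println("Set HADOOP_HOME and pass the directory holding the dependency jars");
            System.exit(1);
        }
        // Same folder the answer names: HADOOP_HOME\share\hadoop\common\lib
        Path target = Paths.get(hadoopHome, "share", "hadoop", "common", "lib");
        System.out.println("Copied: " + copyJars(Paths.get(args[0]), target));
    }
}
```

Only jars directly inside the source directory are copied; subdirectories are ignored, and existing files with the same name are overwritten.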

Alternatively, you can use the published PowerShell script directly and change the path that points to the dependency jars: add your jars to an Azure blob container and replace the path in the script.
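Whichever route is taken, it can help to confirm that the class from the stack trace is actually visible on a node's classpath. The sketch below is a generic JVM check, not an HDInsight or Hadoop API; the class name ClassOnClasspathCheck and the method isLoadable are hypothetical. It only reports on the JVM it runs in, so it would need to be started with the same classpath that Hadoop uses on that node (for example the output of the `hadoop classpath` command) to say anything about the task environment.

```java
// Hypothetical diagnostic, not part of Hadoop or HDInsight.
public class ClassOnClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClassOnClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError from missing transitive dependencies
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the error message above
        String name = args.length > 0
                ? args[0]
                : "com.microsoft.azure.documentdb.hadoop.DocumentDBInputFormat";
        System.out.println(name + " loadable: " + isLoadable(name));
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If the check reports the class as not loadable on a worker node, the jar most likely did not reach that node's lib folder.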
