Mapper class not found
Problem description
Sometimes my MR job complains that the MyMapper class is not found, and that I have to call job.setJarByClass(MyMapper.class); to tell it to load the class from my jar file.
cloudera@cloudera-vm:/tmp/translator$ hadoop jar MapReduceJobs.jar translator/input/Portuguese.txt translator/output
13/06/13 03:36:57 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/06/13 03:36:57 INFO input.FileInputFormat: Total input paths to process : 1
13/06/13 03:36:57 INFO mapred.JobClient: Running job: job_201305100422_0043
13/06/13 03:36:58 INFO mapred.JobClient: map 0% reduce 0%
13/06/13 03:37:03 INFO mapred.JobClient: Task Id : attempt_201305100422_0043_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: com.mapreduce.variousformats.keyvaluetextinputformat.MyMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:601)
Question: Why does this happen? Why doesn't it always tell me to load the class from my jar file? Are there any best practices for tackling this kind of issue? Also, if I am using third-party libraries, will I have to do this for them as well?
Be sure to add any dependencies to both the HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples:
Use the following to add all the jar dependencies from (for example) the current and lib directories:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar`:`echo lib/*.jar | sed 's/ /:/g'`
Bear in mind that when starting a job through hadoop jar you'll also need to pass it the jars of any dependencies through -libjars. I like to use:
hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]
NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH entries are separated by colons (:), while the -libjars list must be separated by commas (,).
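If you would rather not rely on echo and sed (which misbehave when a directory has no jars or a file name contains a space), the same two lists can be built with a small helper. This is only a sketch: join_jars is a hypothetical name I made up, not part of Hadoop, and it assumes the jars sit directly inside the directories you pass in.

```shell
# join_jars SEP DIR...
# Print every *.jar found directly inside the given directories,
# joined by SEP (":" for HADOOP_CLASSPATH, "," for -libjars).
# The body runs in a subshell, so its variables do not leak to the caller.
join_jars() (
  sep=$1
  shift
  out=""
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      # When a directory holds no jars the pattern stays unexpanded; skip it.
      [ -e "$jar" ] || continue
      # Prepend the separator only when the list is already non-empty.
      out="${out:+$out$sep}$jar"
    done
  done
  printf '%s\n' "$out"
)
```

With that defined, the two commands above would look something like export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(join_jars : . lib)" and hadoop jar <jar> <class> -libjars "$(join_jars , lib)" [args...]. Keep in mind that -libjars is a generic option, so as far as I know it is only honored when the driver parses its arguments through ToolRunner / GenericOptionsParser.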