Mapper class not found

Problem description

Sometimes my MR job complains that the MyMapper class is not found, and I have to call job.setJarByClass(MyMapper.class); to tell it to load the class from my jar file.

cloudera@cloudera-vm:/tmp/translator$ hadoop jar MapReduceJobs.jar translator/input/Portuguese.txt translator/output
13/06/13 03:36:57 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
13/06/13 03:36:57 INFO input.FileInputFormat: Total input paths to process : 1
13/06/13 03:36:57 INFO mapred.JobClient: Running job: job_201305100422_0043
13/06/13 03:36:58 INFO mapred.JobClient: map 0% reduce 0%
13/06/13 03:37:03 INFO mapred.JobClient: Task Id : attempt_201305100422_0043_m_000000_0, Status : FAILED
java.lang.RuntimeException: java.lang.ClassNotFoundException: com.mapreduce.variousformats.keyvaluetextinputformat.MyMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:601)

Question: Why does this happen? Why doesn't it always require me to tell it to load the class from my jar file? Are there best practices for tackling this kind of issue? Also, if I am using third-party libraries, will I have to do this for them as well?

Recommended answer

When submitting a job, be sure to add any dependencies to both HADOOP_CLASSPATH and -libjars, as in the following examples.

Use the following to add all the jar dependencies from (for example) the current directory and the lib directory:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar lib/*.jar | sed 's/ /:/g'`
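The one-liner above relies on echo and sed, which can misbehave when a glob matches nothing (the literal pattern ends up on the classpath) or when a file name contains a space. Here is a minimal sketch of a more defensive variant, assuming a POSIX-style shell. The helper name classpath_from_dirs is hypothetical and not part of Hadoop.

```shell
# Sketch: print a ':'-separated list of the *.jar files found in the
# directories given as arguments. classpath_from_dirs is a hypothetical
# helper name chosen for this example, not a Hadoop command.
classpath_from_dirs() {
  local dir jar result=""
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      # An unmatched glob is left as a literal pattern, so skip
      # anything that is not an existing regular file.
      [ -f "$jar" ] || continue
      # Prepend ':' only when result already holds an entry.
      result="${result:+$result:}$jar"
    done
  done
  printf '%s\n' "$result"
}
```

It could then be used as: export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+$HADOOP_CLASSPATH:}$(classpath_from_dirs . lib)". Looping over the glob instead of post-processing echo output keeps each file name intact and avoids a leading ':' when HADOOP_CLASSPATH starts out empty.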

Bear in mind that when starting a job through hadoop jar, you also need to pass it the jars of any dependencies using -libjars. I like to use:

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is colon-separated (:), while -libjars must be comma-separated (,).
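Because the only difference between the two lists is the delimiter, both can be produced by one small function. The following is a sketch under that assumption. join_jars is a hypothetical helper name, and the jar paths in the comments are placeholders.

```shell
# Sketch: join the given file names with a caller-chosen delimiter.
# join_jars is a hypothetical helper, not a Hadoop command.
# Usage: join_jars DELIM FILE...
join_jars() {
  local delim="$1" jar out=""
  shift
  for jar in "$@"; do
    # Insert the delimiter only between entries, never at the ends.
    out="${out:+$out$delim}$jar"
  done
  printf '%s\n' "$out"
}

# The same jar list rendered for each consumer (placeholder paths):
#   join_jars : lib/a.jar lib/b.jar   # for HADOOP_CLASSPATH
#   join_jars , lib/a.jar lib/b.jar   # for -libjars
```

Passing the delimiter as an argument keeps the two lists in sync, since both come from the same set of file names.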
