Hadoop map-reduce operation is failing on writing output

Problem description

I am finally able to start a map-reduce job on Hadoop (running on a single debian machine). However, the map reduce job always fails with the following error:

hadoopmachine@debian:~$ ./hadoop-1.0.1/bin/hadoop jar hadooptest/main.jar nl.mydomain.hadoop.debian.test.Main /user/hadoopmachine/input /user/hadoopmachine/output
Warning: $HADOOP_HOME is deprecated.

12/04/03 07:29:35 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
****hdfs://localhost:9000/user/hadoopmachine/input
12/04/03 07:29:35 INFO input.FileInputFormat: Total input paths to process : 1
12/04/03 07:29:35 INFO mapred.JobClient: Running job: job_201204030722_0002
12/04/03 07:29:36 INFO mapred.JobClient:  map 0% reduce 0%
12/04/03 07:29:41 INFO mapred.JobClient: Task Id : attempt_201204030722_0002_m_000002_0, Status : FAILED
Error initializing attempt_201204030722_0002_m_000002_0:
ENOENT: No such file or directory
at org.apache.hadoop.io.nativeio.NativeIO.chmod(Native Method)
at org.apache.hadoop.fs.FileUtil.execSetPermission(FileUtil.java:692)
at org.apache.hadoop.fs.FileUtil.setPermission(FileUtil.java:647)
at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:509)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:344)
at org.apache.hadoop.mapred.JobLocalizer.initializeJobLogDir(JobLocalizer.java:239)
at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:196)
at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1226)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1201)
at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1116)
at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2404)
at java.lang.Thread.run(Thread.java:636)

12/04/03 07:29:41 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201204030722_0002_m_000002_0&filter=stdout
12/04/03 07:29:41 WARN mapred.JobClient: Error reading task outputhttp://localhost:50060/tasklog?plaintext=true&attemptid=attempt_201204030722_0002_m_000002_0&filter=stderr

Unfortunately, it only says "ENOENT: No such file or directory"; it doesn't say which directory it is actually trying to access. Pinging localhost works, and the input directory does exist. The jar location is also correct.

Can anybody give me a pointer on how to fix this error, or how to find out which file Hadoop is trying to access?

I found several similar problems on the Hadoop mailing list, but no responses on those...

Thanks!

P.S. The config for mapred.local.dir looks like this (in mapred-site.xml):

<property>
  <name>mapred.local.dir</name>
  <value>/home/hadoopmachine/hadoop_data/mapred</value>
  <final>true</final>
</property>
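(As a sanity check, this directory exists and should be writable by the user the TaskTracker runs as; assuming that is the hadoopmachine user shown in the ps output below, something like this confirms write access:)

# Run as the user the TaskTracker runs as (hadoopmachine here):
ls -ld /home/hadoopmachine/hadoop_data/mapred
# A create-and-delete round trip confirms the directory is writable:
touch /home/hadoopmachine/hadoop_data/mapred/.write_test && rm /home/hadoopmachine/hadoop_data/mapred/.write_test && echo writable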

As requested, the output of ps auxww | grep TaskTracker is:

1000      4249  2.2  0.8 1181992 30176 ?       Sl   12:09   0:00
/usr/lib/jvm/java-6-openjdk/bin/java -Dproc_tasktracker -Xmx1000m -Dhadoop.log.dir=/home/hadoopmachine/hadoop-1.0.1/libexec/../logs
-Dhadoop.log.file=hadoop-hadoopmachine-tasktracker-debian.log -Dhadoop.home.dir=/home/hadoopmachine/hadoop-1.0.1/libexec/.. 
-Dhadoop.id.str=hadoopmachine -Dhadoop.root.logger=INFO,DRFA -Dhadoop.security.logger=INFO,NullAppender
-Djava.library.path=/home/hadoopmachine/hadoop-1.0.1/libexec/../lib/native/Linux-i386-32 
-Dhadoop.policy.file=hadoop-policy.xml -classpath [omitted very long list of jars] org.apache.hadoop.mapred.TaskTracker

Solution

From the job tracker, identify which Hadoop node the task executed on. SSH to that node and find the location of the hadoop.log.dir directory (check the mapred-site.xml on that node). My guess is that the hadoop user does not have the correct permissions to create sub-directories in this folder.
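One quick way to see the effective hadoop.log.dir on a running node is to read it off the TaskTracker's own command line, since it is passed as a -D JVM flag (a sketch using plain ps/grep, matching the ps output quoted above):

# Print only the -Dhadoop.log.dir=... argument of the running TaskTracker:
ps auxww | grep '[T]askTracker' | tr ' ' '\n' | grep '^-Dhadoop.log.dir='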

The actual folder it's trying to create lies under ${hadoop.log.dir}/userlogs - check that this folder has the correct permissions.
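In shell terms the check is just an ls -ld (LOG_DIR below is a placeholder for whatever hadoop.log.dir resolves to on that node; note that userlogs may not exist yet if the TaskTracker was never able to create it, which is itself a hint):

LOG_DIR=/path/to/hadoop/logs    # placeholder: use the hadoop.log.dir value found above
ls -ld "$LOG_DIR" "$LOG_DIR/userlogs"
# The owner shown should be the user the TaskTracker runs as, with write access.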

In your case, looking at the ps output, I'm guessing this is the folder whose permissions you need to check:

/home/hadoopmachine/hadoop-1.0.1/libexec/../logs
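If the ownership or mode there turns out to be wrong, reassigning the directory to the user the TaskTracker runs as should let job initialization create its per-job directories under userlogs. A sketch, assuming hadoopmachine is that user (per the ps output above) and that a matching group exists:

# Give the logs tree to the TaskTracker user and make sure it can create subdirectories:
sudo chown -R hadoopmachine:hadoopmachine /home/hadoopmachine/hadoop-1.0.1/logs
sudo chmod -R u+rwX /home/hadoopmachine/hadoop-1.0.1/logs

Then re-submit the job and see whether the attempt still fails with ENOENT.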
