Hadoop Streaming - Unable to find file error


Problem description

I am trying to run a Hadoop Streaming Python job:

bin/hadoop jar contrib/streaming/hadoop-0.20.1-streaming.jar \
    -D stream.non.zero.exit.is.failure=true \
    -input /ixml \
    -output /oxml \
    -mapper scripts/mapper.py \
    -file scripts/mapper.py \
    -inputreader "StreamXmlRecordReader,begin=channel,end=/channel" \
    -jobconf mapred.reduce.tasks=0

I made sure mapper.py has all the necessary permissions. The job fails with:

Caused by: java.io.IOException: Cannot run program "mapper.py":     
error=2, No such file or directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
    at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
... 19 more
Caused by: java.io.IOException: error=2, No such file or directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
    at java.lang.ProcessImpl.start(ProcessImpl.java:91)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)

I also tried copying mapper.py to HDFS and passing it as hdfs://localhost/mapper.py, but that does not work either. Any thoughts on how to fix this?

Solution

Looking at the example on the HadoopStreaming wiki page, it seems that you should change

-mapper scripts/mapper.py 
-file scripts/mapper.py 

to

-mapper mapper.py 
-file scripts/mapper.py 

since "shipped files go to the working directory". You might also need to specify the python interpreter directly:

-mapper "/path/to/python mapper.py" 
-file scripts/mapper.py 
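
As a rough illustration of the rule above, here is a small Python sketch that derives the -mapper and -file values from a local script path and flags a few local problems that can stop a script from being launched directly. The helper names (streaming_script_args, local_script_problems) are hypothetical and not part of Hadoop, and the checks are assumptions about common causes rather than a complete diagnosis.

```python
import os
import stat


def streaming_script_args(local_path, interpreter=None):
    """Build the -mapper/-file pair for a script shipped with -file.

    Shipped files land in the task's working directory, so -mapper uses
    the bare file name while -file keeps the local relative path.
    """
    name = os.path.basename(local_path)
    # With an explicit interpreter, the whole command is a single -mapper value.
    command = "%s %s" % (interpreter, name) if interpreter else name
    return ["-mapper", command, "-file", local_path]


def local_script_problems(local_path):
    """List local issues that may prevent the script from being run directly."""
    if not os.path.isfile(local_path):
        return ["%s does not exist locally" % local_path]
    problems = []
    if not os.stat(local_path).st_mode & stat.S_IXUSR:
        problems.append("not executable by owner")
    with open(local_path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("no shebang line")
    elif first_line.endswith(b"\r\n"):
        problems.append("shebang line ends with CRLF")
    return problems


if __name__ == "__main__":
    print(" ".join(streaming_script_args("scripts/mapper.py")))
    print(local_script_problems("scripts/mapper.py"))
```

If the script has no usable shebang line, passing an interpreter (the second form in the answer) sidesteps the problem, since the interpreter is then the program being launched and mapper.py is only its argument.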
