Hadoop Streaming - Unable to find file error
I am trying to run a hadoop-streaming python job.
bin/hadoop jar contrib/streaming/hadoop-0.20.1-streaming.jar \
    -D stream.non.zero.exit.is.failure=true \
    -input /ixml \
    -output /oxml \
    -mapper scripts/mapper.py \
    -file scripts/mapper.py \
    -inputreader "StreamXmlRecordReader,begin=channel,end=/channel" \
    -jobconf mapred.reduce.tasks=0
I made sure mapper.py has all the necessary permissions, but the job fails with:
Caused by: java.io.IOException: Cannot run program "mapper.py":
error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
... 19 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
at java.lang.ProcessImpl.start(ProcessImpl.java:91)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
I also tried copying mapper.py to HDFS and passing the hdfs://localhost/mapper.py link instead, but that does not work either. Any thoughts on how to fix this?
Looking at the example on the HadoopStreaming wiki page, it seems that you should change
-mapper scripts/mapper.py
-file scripts/mapper.py
to
-mapper mapper.py
-file scripts/mapper.py
since "shipped files go to the working directory". You might also need to specify the Python interpreter directly, quoting the command so that it is passed to -mapper as a single argument:
-mapper "/path/to/python mapper.py"
-file scripts/mapper.py
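
To make the rule concrete, here is a small Python sketch of how the two options relate: -file takes the local path, while -mapper should name the script as it will appear in the task's working directory, i.e. by its base name. The helper streaming_args and the interpreter path /usr/bin/python are made up for illustration and are not part of Hadoop; adjust them to your setup.

```python
import os
import shlex


def streaming_args(local_script, interpreter=None):
    """Build the -mapper/-file arguments for a locally stored mapper script.

    local_script: path on the submitting machine, e.g. "scripts/mapper.py".
    interpreter:  optional interpreter path (hypothetical example value below).
    """
    # -file ships the script into the task's working directory,
    # so the task sees it under its base name only.
    shipped_name = os.path.basename(local_script)
    if interpreter is None:
        mapper_cmd = shipped_name
    else:
        mapper_cmd = f"{interpreter} {shipped_name}"
    return ["-mapper", mapper_cmd, "-file", local_script]


if __name__ == "__main__":
    args = streaming_args("scripts/mapper.py", interpreter="/usr/bin/python")
    # shlex.join quotes the mapper command so the shell passes it as one argument.
    print(shlex.join(args))
```

The printed fragment can be pasted into the hadoop jar command line in place of the original -mapper and -file options. If you omit the interpreter, mapper.py itself has to be directly executable on the task nodes (executable bit set and a valid interpreter line at the top).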