Problem with Hadoop Streaming -file option for Java class files


Question





I am struggling with a very basic issue with the "-file" option in Hadoop Streaming.

First I tried the very basic example in streaming:

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
    -reducer /bin/wc \
    -inputformat KeyValueTextInputFormat \
    -input gutenberg/* \
    -output gutenberg-outputtstchk22

which worked absolutely fine.

Then I copied the IdentityMapper.java source code and compiled it. I placed the resulting class file in the /home/hadoop folder and executed the following in the terminal:

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -file ~/IdentityMapper.class \
    -mapper IdentityMapper.class \
    -reducer /bin/wc \
    -inputformat KeyValueTextInputFormat \
    -input gutenberg/* \
    -output gutenberg-outputtstch6

The execution failed with the following error in the stderr file:

java.io.IOException: Cannot run program "IdentityMapper.class": java.io.IOException: error=2, No such file or directory

Then I tried again, this time copying the IdentityMapper.class file into the Hadoop installation directory and executing the following:

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -file IdentityMapper.class \
    -mapper IdentityMapper.class \
    -reducer /bin/wc \
    -inputformat KeyValueTextInputFormat \
    -input gutenberg/* \
    -output gutenberg-outputtstch5

Unfortunately, I got the same error again.

It would be great if you could help me with this, as I cannot move any further without overcoming it.

Thanks in advance.

Answer

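Why do you want to compile the class? It is already compiled and shipped in the Hadoop jars. You only need to pass the fully qualified class name (org.apache.hadoop.mapred.lib.IdentityMapper), because Hadoop uses reflection to create an instance of the mapper class. A value such as IdentityMapper.class is a file name, not a class name, so it ends up being run as an external command, which is what the "Cannot run program" error shows.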
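To illustrate the reflection point, here is a minimal sketch in plain Java. It is not Hadoop's own code; the class name ReflectiveLoad and the helper instantiate are made up for this example, and java.util.ArrayList stands in for a mapper class so the snippet runs without Hadoop on the classpath.

```java
class ReflectiveLoad {
    // Hypothetical helper: loads a class by its fully qualified name and
    // creates an instance through its public no-argument constructor.
    static Object instantiate(String className) throws ReflectiveOperationException {
        Class<?> cls = Class.forName(className);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // A fully qualified name that is on the classpath can be instantiated.
        Object ok = instantiate("java.util.ArrayList");
        System.out.println("loaded: " + ok.getClass().getName());

        // A file name such as "IdentityMapper.class" is not a class name,
        // so the lookup fails with ClassNotFoundException.
        try {
            instantiate("IdentityMapper.class");
        } catch (ClassNotFoundException e) {
            System.out.println("not a loadable class name: " + e.getMessage());
        }
    }
}
```

The same lookup would apply to org.apache.hadoop.mapred.lib.IdentityMapper, provided the Hadoop jars are on the classpath.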

You have to make sure that the class is on the classpath, e.g. inside a jar that you pass along with the job.
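The error in the question has the shape Java produces when a string is launched as an external program that does not exist. The following sketch reproduces that shape with ProcessBuilder. It is an illustration only, not the streaming implementation; RunAsCommand and tryLaunch are hypothetical names, and it assumes no executable called IdentityMapper.class can be found from the working directory or PATH.

```java
import java.io.IOException;

class RunAsCommand {
    // Hypothetical helper: tries to start the given string as an external
    // program. Returns the IOException message if the launch fails, or
    // null if the process started.
    static String tryLaunch(String command) {
        try {
            Process p = new ProcessBuilder(command).start();
            p.destroy();
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "IdentityMapper.class" is a compiled class file name, not an
        // executable, so the launch is expected to fail.
        System.out.println(tryLaunch("IdentityMapper.class"));
    }
}
```

On a Linux system the printed message should begin with Cannot run program "IdentityMapper.class", matching the stderr output quoted in the question.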
