how to set classpath for a Java program on hadoop file system


Question


I am trying to figure out how to set a classpath that references jars on HDFS, but I cannot find any documentation on it.

 java -cp "how to reference to HDFS?" com.MyProgram 

If I cannot reference the Hadoop file system, then I have to copy all the referenced third-party libs/jars somewhere under $HADOOP_HOME on each Hadoop machine... but I want to avoid this by putting the files on the Hadoop file system. Is this possible?

Example hadoop command line for running the program (this is my expectation; maybe I am wrong):

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.0.3.jar -input inputfileDir -output outputfileDir -mapper /home/nanshi/myprog.java -reducer NONE -file /home/nanshi/myprog.java

However, within the command line above, how do I add the Java classpath? Something like -cp "/home/nanshi/wiki/Lucene/lib/lucene-core-3.6.0.jar:/home/nanshi/Lucene/bin"

Solution

What I suppose you are trying to do is include third-party libraries in your distributed program. There are several options.

Option 1) The easiest option I have found is to put all the jars in the $HADOOP_HOME/lib directory (e.g. /usr/local/hadoop-0.22.0/lib) on all nodes and restart your jobtracker and tasktracker.

Option 2) Use the -libjars option; the command for this is hadoop jar -libjars comma_separated_jars
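As a concrete sketch of that option (the job jar name, main class, and paths below are placeholders loosely based on the question, not a verified setup), -libjars takes a comma-separated list of jars that Hadoop ships to the cluster for you:

```shell
# Hedged sketch: ship a third-party jar with the job via the -libjars
# generic option. myjob.jar and the paths are placeholders.
hadoop jar myjob.jar com.MyProgram \
    -libjars /home/nanshi/wiki/Lucene/lib/lucene-core-3.6.0.jar \
    -input inputfileDir -output outputfileDir
```

Note that -libjars is a generic option handled by GenericOptionsParser, so the driver class generally needs to be run through ToolRunner for it to take effect.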

Option 3) Include the jars in the lib directory of your job jar. You will have to do that while creating the jar.
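A minimal sketch of that layout (file names are illustrative): Hadoop unpacks the job jar on each task node and puts any jars found under its lib/ directory on the task classpath.

```shell
# Hedged sketch: bundle a dependency jar inside the job jar under lib/.
# MyProgram.java and the jar names are placeholders.
mkdir -p build/lib
cp /home/nanshi/wiki/Lucene/lib/lucene-core-3.6.0.jar build/lib/
javac -classpath build/lib/lucene-core-3.6.0.jar -d build MyProgram.java
jar cf myjob.jar -C build .
```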

Option 4) Install all the jars on every machine and include their location in the classpath.

Option 5) You can try putting those jars in the distributed cache.
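This option comes closest to what the question asks for, since the jars live in HDFS rather than on every node. A sketch under assumptions (all paths are placeholders, and whether -libjars accepts hdfs:// URIs directly depends on the Hadoop version; the DistributedCache Java API, e.g. DistributedCache.addFileToClassPath, can always pull a jar from HDFS onto the task classpath):

```shell
# Hedged sketch: stage the jar in HDFS once, then reference it from the
# distributed cache instead of copying it to every node.
hadoop fs -mkdir -p /libs
hadoop fs -put /home/nanshi/wiki/Lucene/lib/lucene-core-3.6.0.jar /libs/
hadoop jar myjob.jar com.MyProgram \
    -libjars hdfs:///libs/lucene-core-3.6.0.jar \
    -input inputfileDir -output outputfileDir
```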
