how to set classpath for a Java program on hadoop file system
Problem Description
I am trying to figure out how to set a classpath that references HDFS, but I cannot find any reference for it.

java -cp "how to reference HDFS?" com.MyProgram

If I cannot reference the hadoop file system, then I have to copy all the referenced third-party libs/jars somewhere under $HADOOP_HOME on each hadoop machine... but I want to avoid this by putting the files on the hadoop file system. Is this possible?
Example hadoop command line for the program to run (this is my expectation, maybe I am wrong):

hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.0.3.jar -input inputfileDir -output outputfileDir -mapper /home/nanshi/myprog.java -reducer NONE -file /home/nanshi/myprog.java

However, within the command line above, how do I add the java classpath? Something like -cp "/home/nanshi/wiki/Lucene/lib/lucene-core-3.6.0.jar:/home/nanshi/Lucene/bin"
What I suppose you are trying to do is include third-party libraries in your distributed program. There are several options.
Option 1) The easiest option I have found is to put all the jars in the $HADOOP_HOME/lib directory (eg /usr/local/hadoop-0.22.0/lib) on all nodes and restart your jobtracker and tasktracker.
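As a rough sketch of option 1 (the paths are hypothetical, and the daemon scripts assume a Hadoop 1.x layout):

```shell
# Copy the third-party jars into Hadoop's lib directory on every node,
# then restart the MapReduce daemons so the new classpath takes effect.
cp /home/nanshi/Lucene/lib/*.jar $HADOOP_HOME/lib/
$HADOOP_HOME/bin/stop-mapred.sh
$HADOOP_HOME/bin/start-mapred.sh
```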
Option 2) Use the -libjars option. The command for this is hadoop jar -libjars comma_separated_jars
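A sketch of option 2, assuming a hypothetical job jar named myjob.jar and a local copy of the Lucene jar. Note that -libjars is handled by GenericOptionsParser, so your main class needs to run through ToolRunner for it to be picked up:

```shell
# -libjars takes a comma-separated list of local jars; hadoop ships them
# to the cluster and adds them to each task's classpath.
hadoop jar myjob.jar com.MyProgram \
  -libjars /home/nanshi/Lucene/lib/lucene-core-3.6.0.jar \
  -input inputfileDir -output outputfileDir
```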
Option 3) Include the jars in the lib directory of your job jar. You will have to do that while creating the jar.
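A sketch of option 3, assuming your compiled classes sit under a hypothetical build/ directory. Hadoop unpacks the job jar on each task node and puts anything under its lib/ directory on the task classpath:

```shell
# Place third-party jars in a lib/ subdirectory inside the job jar.
mkdir -p build/lib
cp /home/nanshi/Lucene/lib/lucene-core-3.6.0.jar build/lib/
jar cvf myjob.jar -C build .
```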
Option 4) Install all the jars on your machine and include their location in the classpath.
Option 5) You can try putting those jars in the distributed cache.
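Option 5 is also the one that lets you keep the jars on HDFS, which is what the question asks for. A minimal sketch for a regular (non-streaming) Java job on Hadoop 1.x, assuming the jar was first uploaded to a hypothetical /libs directory with hadoop fs -put:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

public class MyJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "myjob");
        // Adds the HDFS-resident jar to the classpath of every map
        // and reduce task, so it never has to live on each node's disk.
        DistributedCache.addFileToClassPath(
                new Path("/libs/lucene-core-3.6.0.jar"), job.getConfiguration());
        // ... set mapper/reducer and input/output paths, then
        // job.waitForCompletion(true);
    }
}
```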