How to use external jars in Cloudera Hadoop?
Problem description
I have Cloudera Hadoop version 4 installed on my cluster. It comes packaged with the Google protobuf jar, version 2.4. In my application code I use protobuf classes compiled with protobuf version 2.5.
This causes unresolved compilation problems at run time. Is there a way to run the MapReduce jobs with an external jar, or am I stuck until Cloudera upgrades their service?
Thanks.
Yes, you can run MR jobs with external jars.
Be sure to add any dependencies to both HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples.
You can use the following to add all the jar dependencies from the current and lib directories:
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:`echo *.jar lib/*.jar | sed 's/ /:/g'`
Bear in mind that when starting a job through hadoop jar you'll also need to pass it the jars of any dependencies through -libjars. I like to use:
hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]
NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is : separated, while -libjars needs to be , separated.
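Since the only difference between the two lists is the delimiter, one small helper can build both. The following is a minimal sketch and not part of the original answer: join_by is a hypothetical function name, and the lib directory layout in the commented usage is assumed from the commands above. It relies on the shell joining "$*" with the first character of IFS, so no sed is needed.

```shell
# join_by SEP ITEM...: print all ITEMs joined by the single character SEP.
# The body runs in a subshell, so the IFS change does not leak to the caller.
join_by() (
  IFS="$1"
  shift
  printf '%s\n' "$*"
)

# Demonstration with two placeholder file names:
join_by : a.jar lib/b.jar
join_by , a.jar lib/b.jar

# Intended use on a cluster (commented out; assumes dependency jars in ./lib):
# export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(join_by : lib/*.jar)"
# hadoop jar <jar> <class> -libjars "$(join_by , lib/*.jar)" [args...]
```

Keep in mind that -libjars is a generic Hadoop option, so it only takes effect when the job's main class parses generic options (for example by running through ToolRunner).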
EDIT: If you need your classpath to be interpreted first to ensure your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:
export HADOOP_USER_CLASSPATH_FIRST=true
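To see whether your jar really comes before the pre-packaged one, it can help to list the matching classpath entries in order. This is a rough sketch under assumptions: list_matching is a hypothetical helper, the sample paths are made up, and on a real cluster you would feed it the output of hadoop classpath. It only inspects the client-side classpath, and entries written as directory wildcards (such as lib/*) will not show the individual jars.

```shell
# list_matching CLASSPATH PATTERN: print the classpath entries that contain
# PATTERN, one per line, in the order they appear on the classpath.
list_matching() {
  printf '%s\n' "$1" | tr ':' '\n' | grep -F -- "$2"
}

# Demonstration with a made-up classpath string; the entry listed first wins.
sample="/app/lib/protobuf-java-2.5.0.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar:/usr/lib/hadoop/lib/guava.jar"
list_matching "$sample" protobuf

# On a cluster (commented out; requires the hadoop command):
# list_matching "$(hadoop classpath)" protobuf
```

If the 2.4 jar still appears ahead of yours, the setting has probably not been picked up in the shell that launches the job.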