How to use external jars in Cloudera Hadoop?


Problem description


I have Cloudera Hadoop version 4 installed on my cluster. It comes packaged with the Google protobuf jar, version 2.4. In my application code I use protobuf classes compiled with protobuf version 2.5.

This causes unresolved compilation problems at run time. Is there a way to run the MapReduce jobs with an external jar, or am I stuck until Cloudera upgrades their service?

Thanks.

Solution

Yes, you can run MR jobs with external jars.

Be sure to add any dependencies to both HADOOP_CLASSPATH and -libjars when submitting a job, as in the following examples:

You can use the following to add all the jar dependencies from the current directory and the lib directory:

export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(echo *.jar lib/*.jar | sed 's/ /:/g')"

Bear in mind that when starting a job through hadoop jar, you also need to pass it the jars of any dependencies with -libjars. I like to use:

hadoop jar <jar> <class> -libjars `echo ./lib/*.jar | sed 's/ /,/g'` [args...]

NOTE: The two sed commands use different delimiter characters: HADOOP_CLASSPATH is colon-separated (:), while the -libjars list must be comma-separated (,).
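
The same two lists can also be built without echo and sed. This is a minimal sketch, assuming a POSIX shell and dependencies kept in ./lib; join_jars is a hypothetical helper name, not part of Hadoop. Unlike the echo approach, it yields an empty string when the directory has no jars, and it does not split file names that contain spaces.

```shell
#!/bin/sh
# join_jars DIR SEP: print every DIR/*.jar joined by SEP.
# Hypothetical helper for illustration; not a Hadoop command.
join_jars() {
  dir=$1
  sep=$2
  out=
  for jar in "$dir"/*.jar; do
    # When nothing matches, the pattern stays unexpanded; skip it.
    [ -e "$jar" ] || continue
    # Prepend the separator only when the list is already non-empty.
    out="${out:+$out$sep}$jar"
  done
  printf '%s\n' "$out"
}

# Usage (assumes the dependency jars live in ./lib):
#   export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$(join_jars lib :)"
#   hadoop jar <jar> <class> -libjars "$(join_jars lib ,)" [args...]
```

Passing the separator as an argument keeps the colon-versus-comma difference in one place, so the client classpath and the -libjars list are always built from the same set of files.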

EDIT: If you need your classpath to take precedence, so that your jar (and not the pre-packaged jar) is the one that gets used, you can set the following:

export HADOOP_USER_CLASSPATH_FIRST=true
