Hadoop HADOOP_CLASSPATH issues


Problem Description

This question isn't about distributing jars across the whole cluster for the workers to use.

It's about specifying additional libraries on the client machine. To be more specific: I'm trying to run the following command in order to retrieve the contents of a SequenceFile:

   /path/to/hadoop/script fs -text /path/in/HDFS/to/my/file

It throws this error: text: java.io.IOException: WritableName can't load class: util.io.DoubleArrayWritable

I have a Writable class called DoubleArrayWritable. In fact, on another computer everything works well.

I tried setting HADOOP_CLASSPATH to include the jar containing that class, but with no result. Actually, when running:

   /path/to/hadoop/script classpath 

The output doesn't contain the jar I added to HADOOP_CLASSPATH.
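
For reference, a quick way to check whether a jar actually made it onto the classpath the script builds (a minimal sketch; the jar path is the placeholder used in the answer below):

    export HADOOP_CLASSPATH=/path/to/jar/myjar.jar
    # Print each classpath entry on its own line and look for the jar;
    # if nothing matches, the script dropped or overrode the variable.
    /path/to/hadoop/script classpath | tr ':' '\n' | grep myjar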

The question is: how do you specify extra libraries when running hadoop ("extra" meaning libraries other than the ones the hadoop script automatically includes in the classpath)?

Some more info which might help:

  • I can't modify the hadoop.sh script (nor any associated scripts)
  • I can't copy my library to the /lib directory under the hadoop installation directory
  • In the hadoop-env.sh, which is run from hadoop.sh, there is this line: export HADOOP_CLASSPATH=$HADOOP_HOME/lib, which probably explains why my HADOOP_CLASSPATH env var is ignored (see the sketch after this list).
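
That line assigns rather than appends, so whatever was exported in the client shell is clobbered when the hadoop script sources hadoop-env.sh. A minimal sketch of the difference, shown for contrast only (the file can't be modified in this scenario):

    # Overriding form (as found in this hadoop-env.sh): discards any
    # HADOOP_CLASSPATH value set before the script runs.
    export HADOOP_CLASSPATH=$HADOOP_HOME/lib

    # Appending form: would have preserved the caller's setting.
    export HADOOP_CLASSPATH=$HADOOP_HOME/lib:$HADOOP_CLASSPATH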

Solution

If you are allowed to set HADOOP_CLASSPATH, then

export HADOOP_CLASSPATH=/path/to/jar/myjar.jar:$HADOOP_CLASSPATH; \
    hadoop fs -text /path/in/HDFS/to/my/file

will do the job. Since in your case this variable is overridden in hadoop-env.sh, consider using the -libjars option instead:

hadoop fs -libjars /path/to/jar/myjar.jar -text /path/in/HDFS/to/my/file
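
This works because -libjars is handled by Hadoop's generic options parsing, which FsShell goes through via ToolRunner. Besides staging the jars for MapReduce jobs, it also adds them to the classloader of the client-side configuration, which is what -text uses to resolve the Writable class recorded in the SequenceFile header (at least in Hadoop versions where the fs shell accepts generic options).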

Alternatively, invoke FsShell manually:

java -cp $HADOOP_HOME/lib/*:/path/to/jar/myjar.jar:$CLASSPATH \
org.apache.hadoop.fs.FsShell -conf $HADOOP_HOME/conf/core-site.xml \
-text /path/in/HDFS/to/my/file
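
Since the hadoop scripts themselves can't be modified, that last variant can also be wrapped in a small standalone script. A sketch under those assumptions (the script name and jar path are placeholders):

    #!/usr/bin/env bash
    # hdfs-text.sh (hypothetical name): run FsShell directly with an extra
    # jar on the classpath, bypassing hadoop-env.sh entirely.
    set -e
    EXTRA_JAR=/path/to/jar/myjar.jar   # jar containing DoubleArrayWritable
    # Quote the wildcard so java, not the shell, expands lib/* (Java 6+).
    java -cp "$HADOOP_HOME/lib/*:$EXTRA_JAR:$CLASSPATH" \
        org.apache.hadoop.fs.FsShell -conf "$HADOOP_HOME/conf/core-site.xml" \
        -text "$@"

Usage would then be: ./hdfs-text.sh /path/in/HDFS/to/my/file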
