Hive on Spark: Missing <spark-assembly*.jar>


Problem description



I'm running Hive 2.1.1, Spark 2.1.0 and Hadoop 2.7.3.

I tried to build Spark following the Hive on Spark: Getting Started guide:

./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"

However, I couldn't find any spark-assembly jar files under the Spark directory (find . -name "spark-assembly*.jar" returns nothing). Instead of linking a spark-assembly jar to HIVE_HOME/lib, I tried export SPARK_HOME=/home/user/spark.

I get the following Hive error in beeline:

0: jdbc:hive2://localhost:10000> set hive.execution.engine=spark;
0: jdbc:hive2://localhost:10000> insert into test (id, name) values (1, 'test1');
Error: Error running query: java.lang.NoClassDefFoundError: scala/collection/Iterable (state=,code=0)

I think the error is caused by missing spark-assembly jars.

How could I build / Where could I find those spark-assembly jar files?

How could I fix the above error?

Thank you!

Solution

First of all, Spark no longer builds spark-assembly.jar as of 2.0.0; instead, it builds all dependency jars into the directory $SPARK_HOME/jars.
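
For illustration, here is a small shell sketch of what that layout looks like in a Spark 2.x tree, and one way to expose the individual jars to Hive in place of the old assembly jar (the jar names below are illustrative and depend on your build):

# Spark >= 2.0.0 ships individual dependency jars instead of a single assembly:
ls $SPARK_HOME/jars | head
# One option is to link the Spark jars Hive needs into its lib directory
# (illustrative names; match them to the versions present in your $SPARK_HOME/jars):
ln -s $SPARK_HOME/jars/scala-library-*.jar $HIVE_HOME/lib/
ln -s $SPARK_HOME/jars/spark-core_*.jar $HIVE_HOME/lib/
ln -s $SPARK_HOME/jars/spark-network-common_*.jar $HIVE_HOME/lib/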

Besides, Hive does not support every version of Spark; in fact, it has strong version-compatibility restrictions for running Hive on Spark. Depending on which version of Hive you're using, you can always find the corresponding Spark version in Hive's pom.xml file. For Hive 2.1.1, the Spark version specified in pom.xml is:

<spark.version>1.6.0</spark.version>
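
A quick way to confirm this for whatever Hive release you have is to look at the top-level pom.xml of the Hive source tree; the grep below is just one way to do it:

# Run from the root of the Hive source checkout:
grep -m1 "<spark.version>" pom.xml
# For Hive 2.1.1 this prints: <spark.version>1.6.0</spark.version>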

As you already know, you need to build Spark without Hive support. I don't know why, but the command in Hive on Spark - Getting Started did not work for me; I finally succeeded with the following command:

mvn -Pyarn -Phadoop-2.6 -Dscala-2.11 -DskipTests clean package
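
Once Hive can see a compatible Spark build, here is a hedged sketch of re-running the failing statements from the question (spark.master and the executor memory are illustrative values; adjust them for your cluster):

# Illustrative only: spark.master and spark.executor.memory depend on your setup.
cat > /tmp/hive_on_spark_test.sql <<'EOF'
set hive.execution.engine=spark;
set spark.master=yarn-cluster;
set spark.executor.memory=512m;
insert into test (id, name) values (1, 'test1');
EOF
beeline -u jdbc:hive2://localhost:10000 -f /tmp/hive_on_spark_test.sql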

And a few other troubleshooting tips for problems I met before (hopefully you won't meet them):

  • Starting the Spark master failed because slf4j- or Hadoop-related classes could not be found: run export SPARK_DIST_CLASSPATH=$(hadoop classpath) and try again.
  • Failed to load the snappy native libs: either there is no snappy dependency on the classpath, or the snappy lib on the Hadoop classpath is not the correct version for Spark. Download a correct version of the snappy lib, put it under $SPARK_HOME/lib/, run export SPARK_DIST_CLASSPATH=$SPARK_HOME/lib/*:$(hadoop classpath), and try again (see the sketch after this list).
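
A hedged sketch of keeping both classpath fixes together in $SPARK_HOME/conf/spark-env.sh so they survive restarts (paths are illustrative, and the snappy line is only needed if you had to add that jar manually):

# $SPARK_HOME/conf/spark-env.sh -- illustrative; adjust to your installation
# Make the Hadoop (and slf4j) classes visible to a "hadoop-provided" Spark build:
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# If a matching snappy jar was placed under $SPARK_HOME/lib/, prepend it as well:
export SPARK_DIST_CLASSPATH=$SPARK_HOME/lib/*:$SPARK_DIST_CLASSPATH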

Hope this is helpful and everything goes well for you.
