Hive 2.1.1 on Spark - Which version of Spark should I use
I'm running Hive 2.1.1 and Hadoop 2.7.3 on Ubuntu 16.04.
According to Hive on Spark: Getting Started:
Install/build a compatible version. Hive root pom.xml's <spark.version> defines what version of Spark it was built/tested with.
I checked the pom.xml, and it shows that the Spark version is 1.6.0:
<spark.version>1.6.0</spark.version>
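The `<spark.version>` property can be checked from the command line without opening the file in an editor. A minimal self-contained sketch (the sample file written here stands in for the real Hive root pom.xml, whose path depends on where you unpacked the Hive source):

```shell
# Write a tiny stand-in fragment of the Hive root pom.xml so the
# command below can be run as-is; against a real source tree you
# would point grep at the actual pom.xml instead.
cat > /tmp/hive-pom-sample.xml <<'EOF'
<properties>
  <spark.version>1.6.0</spark.version>
</properties>
EOF

# Extract the Spark version Hive was built/tested against
# (GNU grep with -P for the lookbehind, as shipped on Ubuntu 16.04).
grep -oP '(?<=<spark.version>)[^<]+' /tmp/hive-pom-sample.xml
# prints 1.6.0
```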
But Hive on Spark: Getting Started also says:
Prior to Spark 2.0.0: ./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.4,parquet-provided"
Since Spark 2.0.0: ./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"
So now I'm confused because I am running Hadoop 2.7.3. Do I have to downgrade my Hadoop to 2.4?
Which version of Spark should I use? 1.6.0 or 2.0.0?
Thank you!
The current version of Spark 2.x is not compatible with Hive 2.1 and Hadoop 2.7; there is a major bug: JavaSparkListener is not available, and Hive crashes on execution.
https://issues.apache.org/jira/browse/SPARK-17563
You can try to build Hive 2.1 with Hadoop 2.7 and Spark 1.6 with:
./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.7,parquet-provided"
If you look at the command for Spark 2.0 and later, the difference is that make-distribution.sh has moved into the dev/ folder.
If it does not work for Hadoop 2.7.x, I can confirm that I was able to build it successfully with Hadoop 2.6, using:
./make-distribution.sh --name "hadoop2-without-hive" --tgz "-Pyarn,hadoop-provided,hadoop-2.6,parquet-provided"
and Scala 2.10.5.
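Once the "without-hive" Spark distribution is built, Hive still has to be pointed at it. A minimal sketch of the relevant hive-site.xml properties, following the Hive on Spark: Getting Started guide (the Spark install path below is a placeholder; adjust it to wherever you unpacked your build):

```xml
<!-- Tell Hive to run queries on Spark instead of MapReduce -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
<!-- Placeholder path: wherever the hadoop2-without-hive build was unpacked -->
<property>
  <name>spark.home</name>
  <value>/opt/spark-1.6.0-bin-hadoop2-without-hive</value>
</property>
<!-- Run the Spark application on YARN -->
<property>
  <name>spark.master</name>
  <value>yarn-cluster</value>
</property>
```

With Spark 1.6, the guide also has you make the spark-assembly jar visible to Hive (e.g. by linking it into HIVE_HOME/lib), so check that step as well if Hive cannot find the Spark classes.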