spark 3.x on HDP 3.1 in headless mode with hive - hive tables not found


Problem description

How can I configure Spark 3.x on HDP 3.1, using the headless (https://spark.apache.org/docs/latest/hadoop-provided.html) version of Spark, to interact with Hive?

First, I have downloaded and unzipped the headless Spark 3.x:

cd ~/development/software/spark-3.0.0-bin-without-hadoop
export HADOOP_CONF_DIR=/etc/hadoop/conf/
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export SPARK_DIST_CLASSPATH=$(hadoop --config /usr/hdp/current/spark2-client/conf classpath)
 
ls /usr/hdp # note the version and replace 3.1.x.x-xxx below with it

./bin/spark-shell --master yarn --queue myqueue --conf spark.driver.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' --conf spark.yarn.am.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' --conf spark.hadoop.metastore.catalog.default=hive --files /usr/hdp/current/hive-client/conf/hive-site.xml

spark.sql("show databases").show
// only showing default namespace, existing hive tables are missing
+---------+
|namespace|
+---------+
|  default|
+---------+

spark.conf.get("spark.sql.catalogImplementation")
res2: String = in-memory # I want to see hive here - how? How to add hive jars onto the classpath?
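
For reference, the fallback can be made explicit: spark-shell only enables Hive support when the Spark Hive classes are actually on the classpath, otherwise it drops to the in-memory catalog. A minimal sketch, reusing the queue and hdp.version placeholders from above (this is an assumption about the fallback behaviour, not something the original post shows):

./bin/spark-shell --master yarn --queue myqueue \
  --conf spark.driver.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' \
  --conf spark.yarn.am.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' \
  --conf spark.sql.catalogImplementation=hive \
  --conf spark.hadoop.metastore.catalog.default=hive \
  --files /usr/hdp/current/hive-client/conf/hive-site.xml
# on the headless build this is still expected to fall back to the in-memory catalog,
# because the org.apache.spark.sql.hive classes are not on the classpath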

NOTE

This is an updated version of How can I run spark in headless mode in my custom version on HDP? for Spark 3.x and HDP 3.1, and of custom spark does not find hive databases when running on yarn.

Furthermore: I am aware of the problems with ACID Hive tables in Spark. For now, I simply want to be able to see the existing databases.

We must get the Hive jars onto the classpath. Trying as follows:

 export SPARK_DIST_CLASSPATH="/usr/hdp/current/hive-client/lib*:${SPARK_DIST_CLASSPATH}"

And now using spark-sql:

./bin/spark-sql --master yarn --queue myqueue --conf spark.driver.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' --conf spark.yarn.am.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' --conf spark.hadoop.metastore.catalog.default=hive --files /usr/hdp/current/hive-client/conf/hive-site.xml

This fails with:

Error: Failed to load class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.

I.e. the line export SPARK_DIST_CLASSPATH="/usr/hdp/current/hive-client/lib*:${SPARK_DIST_CLASSPATH}" had no effect (same issue if not set).

Recommended answer

As noted above and in custom spark does not find hive databases when running on yarn, the Hive JARs are needed. They are not supplied in the headless version.
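
A quick way to confirm this (a hedged check; the install path is the one assumed earlier) is to look for the Spark Hive integration jars in the distribution's jars directory:

# the headless ("without-hadoop") package is expected to ship no spark-hive* / hive-* jars,
# whereas a Hadoop-bundled package does
ls ~/development/software/spark-3.0.0-bin-without-hadoop/jars | grep -i hive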

I was not able to retrofit these.

Solution: instead of worrying, simply use the Spark build that comes bundled with Hadoop 3.2 (on HDP 3.1).
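
A minimal sketch of that route (the download URL and archive name are assumptions; adjust hdp.version, queue and paths as before):

# hedged sketch: use the Hadoop 3.2 bundled Spark package instead of the headless one
wget https://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-hadoop3.2.tgz
tar -xzf spark-3.0.0-bin-hadoop3.2.tgz
cd spark-3.0.0-bin-hadoop3.2
export HADOOP_CONF_DIR=/etc/hadoop/conf/
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk

./bin/spark-shell --master yarn --queue myqueue \
  --conf spark.driver.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' \
  --conf spark.yarn.am.extraJavaOptions='-Dhdp.version=3.1.x.x-xxx' \
  --conf spark.hadoop.metastore.catalog.default=hive \
  --files /usr/hdp/current/hive-client/conf/hive-site.xml

# inside the shell, spark.conf.get("spark.sql.catalogImplementation") should now return "hive"
# and spark.sql("show databases").show should list the existing hive databases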
