Spark 2: how does it work when SparkSession enableHiveSupport() is invoked

Question

My question is rather simple, but somehow I cannot find a clear answer by reading the documentation.

I have Spark2 running on a CDH 5.10 cluster. There is also Hive and a metastore.

I create a session in my Spark program as follows:

SparkSession spark = SparkSession.builder().appName("MyApp").enableHiveSupport().getOrCreate();

Suppose I have the following HiveQL query:

spark.sql("SELECT someColumn FROM someTable")

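I want to know whether:

  1. this query is translated into Hive MapReduce primitives under the hood, or
  2. the support for HiveQL is purely syntactic, and Spark SQL is what runs under the hood.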

I am doing some performance evaluation, and I don't know whether the timings of queries executed with spark.sql([hiveQL query]) should be attributed to Spark or to Hive.

Recommended answer

Spark knows two catalogs, hive and in-memory. If you call enableHiveSupport(), then spark.sql.catalogImplementation is set to hive; otherwise it is set to in-memory. So if you enable Hive support, spark.catalog.listTables().show() will show you all tables from the Hive metastore.
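A minimal sketch of how one might check which catalog a session is using, assuming a Spark 2 application with the Hive classes on the classpath and a reachable metastore. The class name CatalogCheck is hypothetical, and the "in-memory" fallback passed to get() is only an assumption for the case where the key is not set explicitly.

```java
import org.apache.spark.sql.SparkSession;

public class CatalogCheck {
    public static void main(String[] args) throws Exception {
        // Same session setup as in the question
        SparkSession spark = SparkSession.builder()
                .appName("MyApp")
                .enableHiveSupport()
                .getOrCreate();

        // Read the catalog setting; the second argument is a fallback value
        // (an assumption here) used if the key is not set on this session.
        String impl = spark.conf().get("spark.sql.catalogImplementation", "in-memory");
        System.out.println("catalog implementation: " + impl);

        // List the tables of the current database as seen through the active catalog
        spark.catalog().listTables().show();

        spark.stop();
    }
}
```

With Hive support enabled, the printed value should be hive and the listing should include the metastore tables; dropping the enableHiveSupport() call is a quick way to compare against the in-memory catalog.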

But this does not mean Hive is used to run the query*. It only means that Spark communicates with the Hive metastore; the execution engine is always Spark.
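One way to see this for yourself is to print the query plan, sketched below under the assumption that someTable (the placeholder name from the question) exists in the metastore. The class name PlanCheck is hypothetical. The plan is produced by Spark's own planner, so the operators it lists are Spark operators rather than MapReduce jobs.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class PlanCheck {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("MyApp")
                .enableHiveSupport()
                .getOrCreate();

        // someTable and someColumn are the placeholder names used in the question
        Dataset<Row> df = spark.sql("SELECT someColumn FROM someTable");

        // Print the parsed, analyzed, optimized and physical plans to stdout
        df.explain(true);

        spark.stop();
    }
}
```

For the performance evaluation in the question, this suggests reporting the measured times as Spark SQL times, with the metastore lookup as the only Hive component involved.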

*There are actually some functions, such as percentile and percentile_approx, which are native Hive UDAFs.
