Accessing Spark SQL RDD tables through the Thrift Server


Question

I have registered a temporary table with Spark SQL, as described in [this section]:

people.registerTempTable("people")
// I can run queries on it all right.
val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")

Now I want to access this table remotely through JDBC. I started the Thrift Server as described in [this other section]:

./sbin/start-thriftserver.sh --master spark://same-master-as-above:7077

But the table is not visible:

0: jdbc:hive2://localhost:10000> show tables;         
+---------+
| result  |
+---------+
+---------+
No rows selected (2.216 seconds)

I guess this is because the table is "temporary" (i.e. tied to the lifetime of the SQLContext object). But how do I make non-temporary tables?

I can see Hive tables through the Thrift Server, but I don't see how I could expose an RDD like this. I've found a comment that suggests I cannot.

Or should I run the Thrift Server inside my application with my own SQLContext? Almost all the classes around it are private, and this code is not in Maven Central (as far as I can see). Am I supposed to use HiveThriftServer2.startWithContext? It is undocumented and marked @DeveloperApi, but it might work.

Recommended answer

Modify spark-defaults.conf and add spark.sql.hive.thriftServer.singleSession true.
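As a config fragment, the setting would look roughly like this. The file location conf/spark-defaults.conf is the usual one but is an assumption about your installation, and the Thrift Server most likely needs a restart to pick up the change.

```
# conf/spark-defaults.conf (path assumed; adjust to your installation)
# Make all JDBC/ODBC connections share a single session.
spark.sql.hive.thriftServer.singleSession   true
```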

This lets the Thrift Server see temporary tables that are built directly on an RDD, without having to save the table first. You can also run CACHE TABLE XXX AS <query> in Spark SQL and have the result exposed via ODBC/JDBC.
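For the in-application route raised in the question, here is a minimal sketch, not a definitive implementation. It assumes a Spark 1.x-style API (HiveContext, registerTempTable) and that the spark-hive-thriftserver module is on the classpath. The Person case class, the sample rows, the object name and the table name teenagers are hypothetical. The temporary table is registered on the same context that the server is started with, which is presumably what makes it visible to JDBC clients.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

// Hypothetical record type standing in for the question's `people` data.
case class Person(name: String, age: Int)

object ThriftServerWithTempTable {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("thrift-server-with-temp-table")
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    // Same setting as in the answer, applied programmatically to this context.
    hiveContext.setConf("spark.sql.hive.thriftServer.singleSession", "true")

    // Sample rows are made up for illustration only.
    val people = sc.parallelize(Seq(Person("Alice", 15), Person("Bob", 32))).toDF()
    people.registerTempTable("people")

    // Optional: cache the result of a query under its own table name.
    hiveContext.sql(
      "CACHE TABLE teenagers AS SELECT name FROM people WHERE age >= 13 AND age <= 19")

    // Start the JDBC/ODBC endpoint on top of this context.
    HiveThriftServer2.startWithContext(hiveContext)
  }
}
```

With this running, `show tables;` from a JDBC client such as the beeline session shown in the question should list `people` and `teenagers`. The application has to stay alive for as long as clients need the tables, because temporary tables disappear with the context that owns them.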
