Accessing Spark SQL RDD tables through the Thrift Server

Problem description

I have registered a temporary table with Spark SQL, as described in this section of the programming guide (https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#inferring-the-schema-using-reflection):

people.registerTempTable("people")
// I can run queries on it all right.
val teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AND age <= 19")

Now I want to access this table remotely through JDBC. I start up the Thrift Server as described in this other section (https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#running-the-thrift-jdbc-server):

./sbin/start-thriftserver.sh --master spark://same-master-as-above:7077

But the table is not visible.

0: jdbc:hive2://localhost:10000> show tables;         
+---------+
| result  |
+---------+
+---------+
No rows selected (2.216 seconds)

I guess this is because the table is "temporary" (i.e. tied to the lifetime of the SqlContext object). But how do I make non-temporary tables?

I can see Hive tables through the Thrift Server, but I don't see how I could expose an RDD like this. I've found a comment (https://github.com/apache/spark/blob/42389b1780311d90499b4ce2315ceabf5b6ab384/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala#L309) that suggests I cannot.
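
A non-temporary table in this sense is one that lives in the Hive metastore rather than being registered on the SqlContext. A rough sketch of that route with the Spark 1.1/1.2-era API (SchemaRDD.saveAsTable), assuming sc is the SparkContext and people is the RDD of case-class rows registered above; note this materializes the data once at save time instead of exposing the live RDD:

import org.apache.spark.sql.hive.HiveContext

// saveAsTable needs a HiveContext; a plain SQLContext cannot write to the Hive metastore.
val hiveContext = new HiveContext(sc)
import hiveContext.createSchemaRDD   // implicit conversion RDD[Product] -> SchemaRDD

// Writes the data out as a permanent table in the Hive warehouse, which the
// separately started Thrift Server can then list and query.
people.saveAsTable("people")

That makes a table visible over JDBC, but it is a snapshot of the data rather than the RDD itself.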

Or should I run the Thrift Server in my application with my own SqlContext? Almost all classes around it are private, and this code is not in Maven Central (as far as I see). Am I supposed to use HiveThriftServer2.startWithContext? It's undocumented and @DeveloperApi, but might work.

Answer

From SPARK-3675:

A common question on the mailing list is how to read from temporary tables over JDBC. While we should try and support most of this in SQL, it would also be nice to query generic RDDs over JDBC.

And the solution (coming in Spark 1.2.0) is indeed to use HiveThriftServer2.startWithContext.
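
A minimal sketch of how that fits together (Spark 1.2-era API; the object name and sample data are illustrative, and spark-hive plus spark-hive-thriftserver need to be on the classpath):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

case class Person(name: String, age: Int)

object ThriftServerWithTempTable {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("rdd-over-jdbc"))
    val hiveContext = new HiveContext(sc)   // startWithContext requires a HiveContext
    import hiveContext.createSchemaRDD      // implicit conversion RDD[Product] -> SchemaRDD

    // Illustrative sample data; in practice this is the existing "people" RDD.
    val people = sc.parallelize(Seq(Person("Alice", 15), Person("Bob", 30)))
    people.registerTempTable("people")      // temp table bound to this HiveContext

    // Start the JDBC/Thrift server inside this JVM, sharing the same HiveContext,
    // so clients connecting on port 10000 can see and query "people".
    HiveThriftServer2.startWithContext(hiveContext)
  }
}

After that, the show tables / SELECT queries from the beeline session above should see people, for as long as this application (and therefore its HiveContext) keeps running.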
