How to access existing table in Hive?


Problem description



I am trying to access Hive from a Spark application written in Scala.

My code:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

val hiveLocation = "hdfs://master:9000/user/hive/warehouse"
val conf = new SparkConf().setAppName("SOME APP NAME").setMaster("local[*]").set("spark.sql.warehouse.dir", hiveLocation)

val sc = new SparkContext(conf)
val spark = SparkSession
  .builder()
  .appName("SparkHiveExample")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", hiveLocation)
  .config("spark.driver.allowMultipleContexts", "true")
  .enableHiveSupport()
  .getOrCreate()
println("Start of SQL Session--------------------")

spark.sql("select * from test").show()
println("End of SQL session-------------------")

But it ends up with the error message:

Table or view not found

But when I run show tables; in the Hive console, I can see that table and can run select * from test. Everything is under the "user/hive/warehouse" location. Just for testing, I also tried creating a table from Spark, to find out where the table ends up.

val spark = SparkSession
  .builder()
  .appName("SparkHiveExample")
  .master("local[*]")
  .config("spark.sql.warehouse.dir", hiveLocation)
  .config("spark.driver.allowMultipleContexts", "true")
  .enableHiveSupport()
  .getOrCreate()
println("Start of SQL Session--------------------")
spark.sql("CREATE TABLE IF NOT EXISTS test11(name String)")
println("End of SQL session-------------------")

This code also executed properly (with a success note), but the strange thing is that I cannot find this table from the Hive console.

Even when I run select * from TBLS; in MySQL (in my setup, MySQL is configured as the metastore for Hive), I do not find the tables that were created from Spark.

Is the Spark warehouse location different from the one the Hive console uses?

What do I have to do if I need to access an existing Hive table from Spark?

Solution

From the Spark SQL programming guide (I have highlighted the relevant parts):

Configuration of Hive is done by placing your hive-site.xml, core-site.xml (for security configuration), and hdfs-site.xml (for HDFS configuration) file in conf/.

When working with Hive, one must instantiate SparkSession with Hive support, including connectivity to a persistent Hive metastore, support for Hive serdes, and Hive user-defined functions. Users who do not have an existing Hive deployment can still enable Hive support. When not configured by the hive-site.xml, the context automatically creates metastore_db in the current directory and creates a directory configured by spark.sql.warehouse.dir, which defaults to the directory spark-warehouse in the current directory that the Spark application is started
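You can check which metastore and warehouse a running session actually uses. A quick diagnostic sketch, assuming the spark session from the code above (the hive.metastore.uris entry may be absent if it only comes from hive-site.xml):

// Diagnostic: where does this session's catalog actually point?
// A local ./spark-warehouse plus no hive.metastore.uris means Spark is
// using its own embedded Derby metastore, not the shared Hive one.
println(spark.conf.get("spark.sql.warehouse.dir"))
println(spark.conf.getOption("hive.metastore.uris").getOrElse("<not set>"))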

In other words, your application never connects to the Hive metastore, so Spark silently creates its own local Derby metastore_db and warehouse directory; that is why Spark and the Hive console see different sets of tables. You need to add a hive-site.xml config file to the resource directory. Here are the minimum values needed for Spark to work with Hive (set the host to the host of the Hive metastore):

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://host:9083</value>
        <description>IP address (or fully-qualified domain name) and port of the metastore host</description>
    </property>

</configuration>
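With that file on the application classpath (e.g. in src/main/resources), enableHiveSupport() will connect to the shared metastore and the existing test table becomes visible. A minimal sketch under that assumption; thrift://host:9083 is a placeholder for your metastore service, the same value can alternatively be set programmatically, and the extra SparkContext from the question is not needed:

import org.apache.spark.sql.SparkSession

// Minimal sketch: read an existing Hive table through the shared metastore.
val spark = SparkSession
  .builder()
  .appName("SparkHiveExample")
  .master("local[*]")
  .config("hive.metastore.uris", "thrift://host:9083") // or rely on hive-site.xml
  .enableHiveSupport()
  .getOrCreate()

spark.sql("show tables").show()        // tables from the Hive metastore
spark.sql("select * from test").show() // the existing table from the question

spark.stop()

Once the session talks to the real metastore, table locations come from the metastore itself, so spark.sql.warehouse.dir only matters for tables that Spark itself creates.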
