How do I connect to Hive from Spark using Scala on IntelliJ?


Question

I am new to hive and spark and am trying to figure out a way to access tables in hive to manipulate and access the data. How can it be done?

Answer

In Spark < 2.0:

 import org.apache.spark.{SparkConf, SparkContext}

 // Spark 1.x: build a SparkContext, then wrap it in a HiveContext
 val conf = new SparkConf().setAppName("Spark Hive Example")
 val sc = new SparkContext(conf)

 val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
 val myDataFrame = sqlContext.sql("select * from mydb.mytable")

In later versions of Spark (2.0+), use SparkSession:

SparkSession is now the new entry point of Spark that replaces the old SQLContext and HiveContext. Note that the old SQLContext and HiveContext are kept for backward compatibility. A new catalog interface is accessible from SparkSession - existing API on databases and tables access such as listTables, createExternalTable, dropTempView, cacheTable are moved here. -- from the docs

 import org.apache.spark.sql.SparkSession

 // Location of the Hive warehouse directory; adjust the path for your setup
 val warehouseLocation = "spark-warehouse"

 val spark = SparkSession
   .builder()
   .appName("Spark Hive Example")
   .config("spark.sql.warehouse.dir", warehouseLocation)
   .enableHiveSupport()
   .getOrCreate()

 val myDataFrame = spark.sql("select * from mydb.mytable")
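Since the question asks about IntelliJ specifically: to compile either snippet in an IntelliJ/sbt project, the Spark SQL and Hive support modules must be on the classpath. A minimal build.sbt sketch (the Spark and Scala versions shown are assumptions; match them to your cluster):

```scala
// build.sbt -- version numbers are illustrative; align them with your cluster
ThisBuild / scalaVersion := "2.12.18"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"  % "3.5.1",
  "org.apache.spark" %% "spark-hive" % "3.5.1"  // provides enableHiveSupport()
)
```

With these dependencies resolved, IntelliJ's sbt import makes `SparkSession` and the Hive classes available; to reach an existing Hive metastore, a `hive-site.xml` on the classpath (e.g. under `src/main/resources`) is also typically needed.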
