How to use Cassandra Context in Spark 2.0
Question
In previous versions of Spark, such as 1.6.1, I created a Cassandra Context from the Spark Context:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.cassandra.CassandraSQLContext

// config
val conf: SparkConf = new SparkConf(true)
  .set("spark.cassandra.connection.host", CassandraHost)
  .setAppName(getClass.getSimpleName)
lazy val sc = new SparkContext(conf)
val cassandraSqlCtx: CassandraSQLContext = new CassandraSQLContext(sc)

// query using the Cassandra context
cassandraSqlCtx.sql("select id from table")
But in Spark 2.0 the Spark Context is replaced by the Spark Session, so how can I use the Cassandra Context?
Answer
Short answer: you don't. It has been deprecated and removed.
Long answer: you don't want to. The HiveContext provides everything except the catalogue and supports a much wider range of SQL (HQL-like). In Spark 2.0 this just means you will need to manually register Cassandra tables using createOrReplaceTempView until an ExternalCatalogue is implemented.
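In Spark 2.0 the single entry point is a SparkSession rather than a SparkContext. A minimal sketch of the equivalent setup, assuming the same `spark.cassandra.connection.host` setting (the host value here is a placeholder, not from the original):

```scala
import org.apache.spark.sql.SparkSession

// Build a SparkSession carrying the Cassandra connection setting;
// "127.0.0.1" is a placeholder for your cluster's contact point.
val spark = SparkSession.builder()
  .appName(getClass.getSimpleName)
  .config("spark.cassandra.connection.host", "127.0.0.1")
  .getOrCreate()
```

All `spark.sql(...)` calls below assume a session built roughly like this, with the spark-cassandra-connector on the classpath.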
In SQL this looks like:
spark.sql("""CREATE TEMPORARY TABLE words
     |USING org.apache.spark.sql.cassandra
     |OPTIONS (
     |  table "words",
     |  keyspace "test")""".stripMargin)
In the raw DataFrame API it looks like:
spark
  .read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "test", "table" -> "words"))
  .load
  .createOrReplaceTempView("words")
Both of these commands will register the table "words" for SQL queries.
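Once registered, the temp view is queried exactly like the old CassandraSQLContext call. A sketch, assuming the registration above has run against a reachable keyspace:

```scala
// Query the registered temp view; this replaces the Spark 1.6
// cassandraSqlCtx.sql(...) pattern from the question.
val df = spark.sql("SELECT * FROM words")
df.show()
```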