Issue with toDF, Value toDF is not a member of org.apache.spark.rdd.RDD
Problem Description
I have attached a code snippet for the error "value toDF is not a member of org.apache.spark.rdd.RDD". I am using Scala 2.11.8 and Spark 2.0.0. Can you please help me resolve this issue with the toDF() API?
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions._

object HHService {

  case class Services(
      uhid: String,
      locationid: String,
      doctorid: String,
      billdate: String,
      servicename: String,
      servicequantity: String,
      starttime: String,
      endtime: String,
      servicetype: String,
      servicecategory: String,
      deptname: String
  )

  def toService = (p: Seq[String]) =>
    Services(p(0), p(1), p(2), p(3), p(4), p(5), p(6), p(7), p(8), p(9), p(10))

  def main(args: Array[String]) {
    val warehouseLocation = "file:${system:user.dir}/spark-warehouse"

    val spark = SparkSession
      .builder
      .appName(getClass.getSimpleName)
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()

    val sc = spark.sparkContext
    val sqlContext = spark.sqlContext
    import spark.implicits._
    import sqlContext.implicits._

    val hospitalDataText = sc.textFile("D:/Books/bboks/spark/Intellipaat/Download/SparkHH/SparkHH/services.csv")
    val header = hospitalDataText.first()
    val hospitalData = hospitalDataText.filter(a => a != header)

    //val HData = hospitalData.map(_.split(",")).map(p => Services(p(0), p(1), p(2), p(3), p(4), p(5), p(6), p(7), p(8), p(9), p(10)))
    val HData = hospitalData.map(_.split(",")).map(toService(_))
    val hosService = HData.toDF()
  }
}
Recommended Answer
1] Need to get the sqlContext as below:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
This solved my issue. Earlier, the following snippet was used to get the sqlContext: val sqlContext = spark.sqlContext (that approach worked in spark-shell).
2] The case class needs to be outside the method. This is also mentioned in most blog posts.
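Putting both fixes together, a minimal sketch of the corrected structure might look like the following. The field list is shortened to three columns and the file path is illustrative; the key points are that the case class sits at the top level of the object (outside main, so Spark can derive an Encoder for it) and that the implicits are imported before calling toDF():

```scala
import org.apache.spark.sql.SparkSession

object HHService {

  // Fix 2: the case class is defined at object level, NOT inside main,
  // so that the implicit Encoder derivation for toDF() can find it.
  case class Services(uhid: String, locationid: String, doctorid: String)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName(getClass.getSimpleName)
      .getOrCreate()
    val sc = spark.sparkContext

    // Fix 1: bring the toDF/toDS implicits into scope.
    import spark.implicits._

    val lines  = sc.textFile("services.csv") // illustrative path
    val header = lines.first()
    val data = lines
      .filter(_ != header)
      .map(_.split(","))
      .map(p => Services(p(0), p(1), p(2)))

    val df = data.toDF() // now compiles: toDF is available on the RDD
    df.printSchema()

    spark.stop()
  }
}
```

In Spark 2.x, `import spark.implicits._` on the SparkSession is sufficient; the separate `new SQLContext(sc)` shown above also works but is the pre-2.0 style.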