"Unable to find encoder for type stored in a Dataset" and "not enough arguments for method map"?


Question

The following code produces two errors on the last map(...). What parameter is missing from the map() call, and how can the "encoder" error be resolved?

Error:


Error:(60, 11) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
      .map(r => Cols(r.getInt(0), r.getString(1), r.getString(2), r.getString(3), r.getDouble(4), r.getDate(5), r.getString(6), r.getString(7), r.getDouble(8), r.getString(9)))

Error:(60, 11) not enough arguments for method map: (implicit evidence$6: org.apache.spark.sql.Encoder[Cols])org.apache.spark.sql.Dataset[Cols].
Unspecified value parameter evidence$6.
      .map(r => Cols(r.getInt(0), r.getString(1), r.getString(2), r.getString(3), r.getDouble(4), r.getDate(5), r.getString(6), r.getString(7), r.getDouble(8), r.getString(9)))

Code:

import java.sql.Date

case class Cols(A: Int,
                B: String,
                C: String,
                D: String,
                E: Double,
                F: Date,
                G: String,
                H: String,
                I: Double,
                J: String)

class SqlData(sqlContext: org.apache.spark.sql.SQLContext, jdbcSqlConn: String) {
  def getAll(source: String) = {
    sqlContext.read.format("jdbc").options(Map(
      "driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
      "url" -> jdbcSqlConn,
      "dbtable" -> s"MyFunction('$source')"
    )).load()
      .select("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
      // The following line (line 60 in the error output) produces the errors.
      .map((r) => Cols(r.getInt(0), r.getString(1), r.getString(2), r.getString(3), r.getDouble(4), r.getDate(5), r.getString(6), r.getString(7), r.getDouble(8), r.getString(9)))
  }
}


Update:

I have the following function

def compare(sqlContext: org.apache.spark.sql.SQLContext, dbo: Dataset[Cols], ods: Dataset[Cols]) = {
  import sqlContext.implicits._
  dbo.map((r) => ods.map((s) => { // Errors occur here
    0
  }))
}

and it produces the same error.

  1. Why does the error still occur after I imported sqlContext.implicits._? (See the note after this list.)
  2. I created a new parameter sqlContext just for the import. Is there a better way to do this?
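
Note: importing the implicits cannot fix the compare snippet above. The outer map would need an Encoder for the value it returns, and there is no Encoder for a Dataset itself; Spark also does not allow one Dataset to be referenced inside another Dataset's transformation. A comparison like this is normally expressed as a join instead. A minimal sketch, assuming (hypothetically) that column A identifies matching rows:

def compare(dbo: Dataset[Cols], ods: Dataset[Cols]): Dataset[(Cols, Cols)] = {
  // joinWith keeps both sides as typed Cols values instead of flattening to a Row.
  // Joining on column A is an assumption here; use whatever key actually matches rows.
  dbo.joinWith(ods, dbo("A") === ods("A"))
}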

Recommended answer

Combining all the comments into an answer:

def getAll(source: String): Dataset[Cols] = {
  import sqlContext.implicits._ // this imports the necessary implicit Encoders

  sqlContext.read.format("jdbc").options(Map(
    "driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "url" -> jdbcSqlConn,
    "dbtable" -> s"MyFunction('$source')"
  )).load().as[Cols] // shorter way to convert into Cols, thanks @T.Gaweda
}
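
If you would rather keep the explicit row-by-row mapping instead of as[Cols], the same import also supplies the Encoder[Cols] that map needs. A minimal sketch of that variant, reusing the question's column order:

def getAll(source: String): Dataset[Cols] = {
  import sqlContext.implicits._ // derives Encoder[Cols] for the case class

  sqlContext.read.format("jdbc").options(Map(
    "driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "url" -> jdbcSqlConn,
    "dbtable" -> s"MyFunction('$source')"
  )).load()
    .select("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
    .map(r => Cols(r.getInt(0), r.getString(1), r.getString(2), r.getString(3),
      r.getDouble(4), r.getDate(5), r.getString(6), r.getString(7),
      r.getDouble(8), r.getString(9)))
}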
