"Unable to find encoder for type stored in a Dataset" even though spark.implicits._ is imported?


Problem Description


Error:(39, 12) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._  Support for serializing other types will be added in future releases.
    dbo.map((r) => ods.map((s) => {

Error:(39, 12) not enough arguments for method map: (implicit evidence$6: org.apache.spark.sql.Encoder[org.apache.spark.sql.Dataset[Int]])org.apache.spark.sql.Dataset[org.apache.spark.sql.Dataset[Int]].
Unspecified value parameter evidence$6.
    dbo.map((r) => ods.map((s) => {

object Main extends App {
  ....

  def compare(sqlContext: org.apache.spark.sql.SQLContext,
              dbo: Dataset[Cols], ods: Dataset[Cols]) = {
    import sqlContext.implicits._ // Tried import dbo.sparkSession.implicits._ too
    dbo.map((r) => ods.map((s) => { // Errors occur here
      0
    }))
  }

  case class Cols(A: Int,
                  B: String,
                  C: String,
                  D: String,
                  E: Double,
                  F: Date,
                  G: String,
                  H: String,
                  I: Double,
                  J: String)
}

  1. Why does the error still occur after I imported sqlContext.implicits._?
  2. I created the parameter sqlContext only for importing. Is there a better way to do it without passing a parameter?

This should be resolved via import dbo.sparkSession.implicits._.
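
For question 2, importing the implicits from the Dataset's own SparkSession inside the method avoids the extra parameter. A minimal sketch, assuming the Cols case class from the question:

import org.apache.spark.sql.Dataset

def compare(dbo: Dataset[Cols], ods: Dataset[Cols]) = {
  // Every Dataset carries a reference to its SparkSession,
  // so the encoder implicits can be imported from it directly.
  import dbo.sparkSession.implicits._
  dbo.map(r => r.A) // Encoder[Int] is now in scope
}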

Recommended Answer

Your code is trying to create a Dataset[Dataset[Int]], which is wrong for several reasons:

You can't use Datasets inside a Dataset; if you want to cross data from two Datasets, you need to join them somehow.

There is no way an Encoder[Dataset[Int]] can be created; you can have an Encoder[Int], but an encoder for a nested Dataset makes no sense.
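
You can see this directly in the Spark shell. A quick sketch, assuming an active SparkSession named spark:

import org.apache.spark.sql.{Dataset, Encoder}
import spark.implicits._

implicitly[Encoder[Int]]             // compiles: primitive encoders are provided
// implicitly[Encoder[Dataset[Int]]] // does not compile: no such encoder exists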

Something like this makes more sense:

import org.apache.spark.sql.{functions => func}

// joinWith on an always-true condition cross-joins the two Datasets,
// producing every (r, s) pair for comparison.
dbo.joinWith(ods, func.expr("true")).map {
  case (r, s) =>
    0
}
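
Put together with the original compare signature, a runnable sketch (again assuming the Cols case class from the question; the body of the map is a placeholder):

import org.apache.spark.sql.{Dataset, functions => func}

def compare(dbo: Dataset[Cols], ods: Dataset[Cols]): Dataset[Int] = {
  import dbo.sparkSession.implicits._
  // The cross join yields a Dataset[(Cols, Cols)], which has a valid
  // tuple encoder, and Encoder[Int] covers the mapped result.
  dbo.joinWith(ods, func.expr("true")).map {
    case (r, s) => 0 // replace with the actual comparison of r and s
  }
}

Note that depending on your Spark version, cartesian products like this may require setting spark.sql.crossJoin.enabled=true.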

