How to use java.time.LocalDate in Datasets (fails with java.lang.UnsupportedOperationException: No Encoder found)?


Question

  • Spark 2.1.1
  • Scala 2.11.8
  • Java 8
  • Linux Ubuntu 16.04 LTS

I'd like to transform my RDD into a Dataset. For this, I use the implicit method toDS(), which gives me the following error:

Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
- field (class: "java.time.LocalDate", name: "date")
- root class: "observatory.TemperatureRow"
    at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:602)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:596)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:587)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.immutable.List.flatMap(List.scala:344)

In my case, I must use the type java.time.LocalDate and can't use java.sql.Date. I have read that I need to tell Spark how to transform the Java type into an SQL type, so I built the two implicit functions below:

implicit def toSerialized(t: TemperatureRow): EncodedTemperatureRow = EncodedTemperatureRow(t.date.toString, t.location, t.temperature)
implicit def fromSerialized(t: EncodedTemperatureRow): TemperatureRow = TemperatureRow(LocalDate.parse(t.date), t.location, t.temperature)


Below, some code from my application:

case class Location(lat: Double, lon: Double)

case class TemperatureRow(
                             date: LocalDate,
                             location: Location,
                             temperature: Double
                         )

case class EncodedTemperatureRow(
                             date: String,
                             location: Location,
                             temperature: Double
                         )

val s = Seq[TemperatureRow](
                    TemperatureRow(LocalDate.parse("2017-01-01"), Location(1.4,5.1), 4.9),
                    TemperatureRow(LocalDate.parse("2014-04-05"), Location(1.5,2.5), 5.5)
                )

import spark.implicits._
val temps: RDD[TemperatureRow] = sc.parallelize(s)
val tempsDS = temps.toDS

I don't know why Spark searches for an encoder for java.time.LocalDate, since I provide implicit conversions between TemperatureRow and EncodedTemperatureRow...

Answer

java.time.LocalDate is not supported up to Spark 2.2 (and I've been trying to write an Encoder for the type for some time and have failed). Your implicit conversions between TemperatureRow and EncodedTemperatureRow don't help, because toDS resolves an Encoder for the element type of the RDD itself, and that RDD holds TemperatureRow.
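One workaround that can be made to compile on Spark 2.x is a Kryo-based encoder, which serializes each whole row into a single binary column (so you lose column-level pruning and filters on date and location). A sketch, passing the encoder explicitly to avoid clashing with the case-class encoders from spark.implicits._:

```scala
import org.apache.spark.sql.Encoders

// Kryo serializes the whole TemperatureRow into one binary column,
// so no per-field encoder (and thus no LocalDate encoder) is needed.
val tempsDS = spark.createDataset(temps)(Encoders.kryo[TemperatureRow])
```

The resulting Dataset is opaque to Catalyst, so this is only worthwhile when you mainly use typed operations like map and filter on the objects themselves.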

You have to convert java.time.LocalDate to some other supported type (e.g. java.sql.Timestamp or java.sql.Date), or to an epoch value or a date-time string.
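A minimal sketch of the java.sql.Date route, mapping the RDD to a date-friendly row type before calling toDS (the class name SqlDateTemperatureRow is made up for illustration; Location, temps, spark and sc are taken from the question):

```scala
import java.sql.Date

// Spark 2.x ships an encoder for java.sql.Date, so a row type holding
// Date instead of LocalDate can be encoded out of the box.
case class SqlDateTemperatureRow(date: Date, location: Location, temperature: Double)

import spark.implicits._
val tempsDS = temps
  .map(t => SqlDateTemperatureRow(Date.valueOf(t.date), t.location, t.temperature))
  .toDS

// Going back to the original type after collect() is a one-liner per row:
// r => TemperatureRow(r.date.toLocalDate, r.location, r.temperature)
```

Date.valueOf(LocalDate) and Date#toLocalDate have been part of the JDK since Java 8, so the round trip is lossless for calendar dates.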

