How to use java.time.LocalDate in Datasets (fails with java.lang.UnsupportedOperationException: No Encoder found)?


Question

  • Spark 2.1.1
  • Scala 2.11.8
  • Java 8
  • Linux Ubuntu 16.04 LTS

I'd like to transform my RDD into a Dataset. For this, I use the implicit method toDS(), which gives me the following error:

Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
- field (class: "java.time.LocalDate", name: "date")
- root class: "observatory.TemperatureRow"
    at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:602)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:596)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:587)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
    at scala.collection.immutable.List.flatMap(List.scala:344)

In my case, I must use the type java.time.LocalDate; I can't use java.sql.Date. I have read that I need to inform Spark how to transform the Java type into a SQL type, so in that direction I built the two implicit functions below:

implicit def toSerialized(t: TemperatureRow): EncodedTemperatureRow = EncodedTemperatureRow(t.date.toString, t.location, t.temperature)
implicit def fromSerialized(t: EncodedTemperatureRow): TemperatureRow = TemperatureRow(LocalDate.parse(t.date), t.location, t.temperature)
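For reference, a self-contained sketch of the same round trip, using the class names from the question (written as plain, non-implicit methods so it can be tested outside Spark; note that `LocalDate.toString` and `LocalDate.parse` round-trip via the ISO-8601 `yyyy-MM-dd` format):

```scala
import java.time.LocalDate

case class Location(lat: Double, lon: Double)
case class TemperatureRow(date: LocalDate, location: Location, temperature: Double)
case class EncodedTemperatureRow(date: String, location: Location, temperature: Double)

// Serialize the LocalDate field to its ISO-8601 string form
def toSerialized(t: TemperatureRow): EncodedTemperatureRow =
  EncodedTemperatureRow(t.date.toString, t.location, t.temperature)

// Parse the ISO-8601 string back into a LocalDate
def fromSerialized(e: EncodedTemperatureRow): TemperatureRow =
  TemperatureRow(LocalDate.parse(e.date), e.location, e.temperature)
```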


Below is some code from my application:

case class Location(lat: Double, lon: Double)

case class TemperatureRow(
                             date: LocalDate,
                             location: Location,
                             temperature: Double
                         )

case class EncodedTemperatureRow(
                             date: String,
                             location: Location,
                             temperature: Double
                         )

val s = Seq[TemperatureRow](
                    TemperatureRow(LocalDate.parse("2017-01-01"), Location(1.4,5.1), 4.9),
                    TemperatureRow(LocalDate.parse("2014-04-05"), Location(1.5,2.5), 5.5)
                )

import spark.implicits._
val temps: RDD[TemperatureRow] = sc.parallelize(s)
val tempsDS = temps.toDS

I don't know why Spark searches for an encoder for java.time.LocalDate, since I provide implicit conversions between TemperatureRow and EncodedTemperatureRow...

Answer

java.time.LocalDate is not supported up to and including Spark 2.2 (and I've been trying to write an Encoder for the type for some time and failed).

You have to convert java.time.LocalDate to some other supported type (e.g. java.sql.Timestamp or java.sql.Date), or to an epoch or a date-time string.
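A minimal sketch of that workaround, assuming a hypothetical SqlDateTemperatureRow case class (not in the original) that mirrors TemperatureRow but uses java.sql.Date, for which Spark 2.x ships a built-in encoder:

```scala
import java.time.LocalDate
import java.sql.Date

case class Location(lat: Double, lon: Double)
case class TemperatureRow(date: LocalDate, location: Location, temperature: Double)

// Hypothetical mirror of TemperatureRow with a Spark-encodable date column
case class SqlDateTemperatureRow(date: Date, location: Location, temperature: Double)

// java.sql.Date.valueOf(LocalDate) exists since Java 8
def toSqlDate(t: TemperatureRow): SqlDateTemperatureRow =
  SqlDateTemperatureRow(Date.valueOf(t.date), t.location, t.temperature)
```

With this in place, something like `temps.map(toSqlDate).toDS` should succeed, since `spark.implicits._` provides an encoder for case classes whose fields are all supported types, including `java.sql.Date`.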
