How to use java.time.LocalDate in Datasets (fails with java.lang.UnsupportedOperationException: No Encoder found)?
Question
- Spark 2.1.1
- Scala 2.11.8
- Java 8
- Linux Ubuntu 16.04 LTS
I'd like to transform my RDD into a Dataset. For this, I use the implicits method toDS(), which gives me the following error:
Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
- field (class: "java.time.LocalDate", name: "date")
- root class: "observatory.TemperatureRow"
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:602)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:596)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:587)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
In my case, I must use the type java.time.LocalDate; I can't use java.sql.Date. I have read that I need to tell Spark how to transform the Java type into a SQL type, so in that direction I built the two implicit functions below:
implicit def toSerialized(t: TemperatureRow): EncodedTemperatureRow = EncodedTemperatureRow(t.date.toString, t.location, t.temperature)
implicit def fromSerialized(t: EncodedTemperatureRow): TemperatureRow = TemperatureRow(LocalDate.parse(t.date), t.location, t.temperature)
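Note that toDS does not consult implicit conversions when it derives an Encoder, so the mapping between the two row types has to be applied explicitly. A minimal sketch of that round trip, using the question's case classes (the TemperatureCodec object is a hypothetical name introduced here for illustration):

```scala
import java.time.LocalDate

case class Location(lat: Double, lon: Double)
case class TemperatureRow(date: LocalDate, location: Location, temperature: Double)
case class EncodedTemperatureRow(date: String, location: Location, temperature: Double)

object TemperatureCodec {
  // LocalDate.toString / LocalDate.parse both use ISO-8601 (yyyy-MM-dd),
  // so encoding the date as a String is lossless.
  def encode(t: TemperatureRow): EncodedTemperatureRow =
    EncodedTemperatureRow(t.date.toString, t.location, t.temperature)

  def decode(e: EncodedTemperatureRow): TemperatureRow =
    TemperatureRow(LocalDate.parse(e.date), e.location, e.temperature)
}

// With Spark, the conversion must happen before toDS, e.g.:
//   temps.map(TemperatureCodec.encode).toDS
```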
Below is some code from my application:
case class Location(lat: Double, lon: Double)
case class TemperatureRow(
date: LocalDate,
location: Location,
temperature: Double
)
case class EncodedTemperatureRow(
date: String,
location: Location,
temperature: Double
)
val s = Seq[TemperatureRow](
TemperatureRow(LocalDate.parse("2017-01-01"), Location(1.4,5.1), 4.9),
TemperatureRow(LocalDate.parse("2014-04-05"), Location(1.5,2.5), 5.5)
)
import spark.implicits._
val temps: RDD[TemperatureRow] = sc.parallelize(s)
val tempsDS = temps.toDS
I don't know why Spark searches for an encoder for java.time.LocalDate; I provide implicit conversions for TemperatureRow and EncodedTemperatureRow...
Answer
java.time.LocalDate is not supported up to Spark 2.2 (I've been trying to write an Encoder for the type for some time and failed).
You have to convert java.time.LocalDate to some other supported type (e.g. java.sql.Timestamp or java.sql.Date), or to an epoch or a date-time string.