Spark:java.lang.UnsupportedOperationException:找不到java.time.LocalDate的编码器 [英] Spark: java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
问题描述
我正在使用2.1.1版编写一个Spark应用程序.以下代码在调用带有LocalDate参数的方法时出现错误?
I'm writing a Spark application using version 2.1.1. The following code got the error when calling a method with LocalDate parameter?
Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
- field (class: "java.time.LocalDate", name: "_2")
- root class: "scala.Tuple2"
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:602)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:596)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$9.apply(ScalaReflection.scala:587)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:587)
....
val date : LocalDate = ....
val conf = new SparkConf()
val sc = new SparkContext(conf.setAppName("Test").setMaster("local[*]"))
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val itemListJob = new ItemList(sqlContext, jdbcSqlConn)
import sqlContext.implicits._
val processed = itemListJob.run(rc, priority).select("id").map(d => {
runJob.run(d, date)
})
class ItemList(sqlContext: org.apache.spark.sql.SQLContext, jdbcSqlConn: String) {
def run(date: LocalDate) = {
import sqlContext.implicits._
sqlContext.read.format("jdbc").options(Map(
"driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
"url" -> jdbcSqlConn,
"dbtable" -> s"dbo.GetList('$date')"
)).load()
.select("id")
.as[Int]
}
}
更新:
我将runJob.run()
的返回类型更改为元组(int, java.sql.Date)
,并将.map(...)
的lambda中的代码更改为
Update:
I changed the return type of runJob.run()
to tuple (int, java.sql.Date)
and changed the code in the lambda of .map(...)
to
val processed = itemListJob.run(rc, priority).select("id").map(d => {
val (a,b) = runJob.run(d, date)
$"$a, $b"
})
现在错误更改为
[error] C:\....\scala\main.scala:40: Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing spark.implicits._ Support for serializing other types will be added in future releases.
[error] val processed = itemListJob.run(rc, priority).map(d => {
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
推荐答案
对于自定义数据集类型,只要您的数据实际上是可序列化的(也可以实现Serializable),就可以使用Kyro serde框架.这是使用Kyro的一个示例:
for custom dataset type, you can use Kyro serde framework, as long as your data is actually serializable(aka. implements Serializable). here is one example of using Kyro: Spark No Encoder found for java.io.Serializable in Map[String, java.io.Serializable].
Kyro,因为它速度更快并且还与Java serde框架兼容.您当然可以选择Java本机Serde(ObjectWriter/ObjectReader),但速度要慢得多.
Kyro is always recommended since it's much faster and also compatible with Java serde framework. you can definitely choose Java native serde(ObjectWriter/ObjectReader) but it's much slower.
就像上面的注释一样,SparkSQL在sqlContext.implicits._
下附带了许多有用的编码器,但是并不能涵盖所有内容,因此您可能必须插入自己的编码器.
like the comments above, SparkSQL comes with lots of useful Encoders under sqlContext.implicits._
, but that won't cover everything, so you might have to plugin your own Encoder.
就像我说的那样,您的自定义数据必须可序列化,并且根据 https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html ,它实现了Serializable接口,因此在这里绝对不错.
Like I said, your custom data has to be serializable, and according to https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html, it implements Serializable interface, so you are definitely good here.
这篇关于Spark:java.lang.UnsupportedOperationException:找不到java.time.LocalDate的编码器的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!