自Spark 2.X起,无法使用scala.None值创建org.apache.spark.sql.Row [英] Unable to create org.apache.spark.sql.Row with scala.None value since Spark 2.X

查看:263
本文介绍了自Spark 2.X起,无法使用scala.None值创建org.apache.spark.sql.Row的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

由于Spark 2.X无法使用scala.None值创建org.apache.spark.sql.Row(Spark 1.6.X可能存在)

Caused by: java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: scala.None$ is not a valid external type for schema of string

可复制的示例:

 import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

spark.createDataFrame(
  sc.parallelize(Seq(Row(None))),
  StructType(Seq(StructField("v", StringType, true)))
).first
 

要点: https://gist.github.com/AleksandrPavlenko/bef1c34458883730cc319b2e7378c8c6

好像已在 SPARK-15657 中进行了更改(不确定,仍在尝试证明这一点)

解决方案

这是预期行为,如 SPARK-19056 (行编码器应接受可选类型):

这是故意的.从来没有记录允许在Row中使用Option,并且在将编码器框架应用于所有类型的操作时会带来很多麻烦.从Spark 2.0开始,请对键入的操作/自定义对象使用Dataset

Since Spark 2.X unable to create org.apache.spark.sql.Row with scala.None value (it was possible for Spark 1.6.X)

Caused by: java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: scala.None$ is not a valid external type for schema of string

Reproducible example:

import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

spark.createDataFrame(
  sc.parallelize(Seq(Row(None))),
  StructType(Seq(StructField("v", StringType, true)))
).first

Gist: https://gist.github.com/AleksandrPavlenko/bef1c34458883730cc319b2e7378c8c6

Looks like it was changed in SPARK-15657 (not sure, still trying to prove it)

解决方案

This is an expected behavior as described in SPARK-19056 (Row encoder should accept optional types):

This is intentional. Allowing Option in Row is never documented and brings a lot of troubles when we apply the encoder framework to all typed operations. Since Spark 2.0, please use Dataset for typed operation/custom objects

这篇关于自Spark 2.X起,无法使用scala.None值创建org.apache.spark.sql.Row的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆