Problems creating a DataFrame from Rows containing Option[T]


Question

I'm migrating some code from Spark 1.6 to Spark 2.1 and am struggling with the following issue.

This worked perfectly in Spark 1.6:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}

val schema = StructType(Seq(StructField("i", LongType, nullable = true)))
// The Row wraps an Option value; Spark 1.6 accepted this
val rows = sparkContext.parallelize(Seq(Row(Some(1L))))
sqlContext.createDataFrame(rows, schema).show

The same code in Spark 2.1.1:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// ss is the SparkSession
val schema = StructType(Seq(StructField("i", LongType, nullable = true)))
// The Row still wraps an Option value; this is what fails at runtime
val rows = ss.sparkContext.parallelize(Seq(Row(Some(1L))))
ss.createDataFrame(rows, schema).show

gives the following runtime exception:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 8.0 failed 4 times, most recent failure: Lost task 0.3 in stage 8.0 (TID 72, i89203.sbb.ch, executor 9): java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException: scala.Some is not a valid external type for schema of bigint

So how should I translate such code to Spark 2.x if I want nullable Longs rather than Option[Long]?

Recommended answer

There is actually a JIRA ticket, SPARK-19056, about this, and it concluded that this is not really a bug.

So this behavior is intentional.

Allowing Option in Row was never documented, and it causes a lot of trouble when the encoder framework is applied to all typed operations. Since Spark 2.0, please use Dataset for typed operations and custom objects, e.g.:

// .toDS needs the session's implicits in scope, e.g. import ss.implicits._
val ds = Seq(1 -> None, 2 -> Some("str")).toDS
ds.toDF // schema: <_1: int, _2: string>
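If the code has to keep building Rows against an explicit schema, one possible approach for the nullable Long case in the question is to unwrap each Option before it goes into the Row, so that None becomes null. The sketch below is only an illustration and is not taken from the JIRA ticket: the local SparkSession setup and the `values` sequence are assumptions added to make it self-contained, and it is written in spark-shell/script style like the snippets above.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

// Assumption: a local session for illustration; the question calls its session `ss`.
val ss = SparkSession.builder()
  .master("local[1]")
  .appName("option-in-row")
  .getOrCreate()

val schema = StructType(Seq(StructField("i", LongType, nullable = true)))

// Hypothetical input: optional values that used to be placed in a Row as-is.
val values: Seq[Option[Long]] = Seq(Some(1L), None)

// Unwrap before building the Row: Some(x) becomes a boxed java.lang.Long,
// None becomes null. Long.box is needed because orNull only works for
// reference types, not for the primitive scala.Long.
val rows = ss.sparkContext.parallelize(
  values.map(v => Row(v.map(Long.box).orNull))
)

val df = ss.createDataFrame(rows, schema)
df.show()
```

The column stays a nullable bigint, and the Option is only handled at the boundary where Rows are constructed. When reading values back, `row.isNullAt(0)` can be checked before calling `row.getLong(0)`.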
