How to create encoder for Option type constructor, e.g. Option[Int]?
Question
Is it possible to use an Option[_] member, e.g. Option[Int], in a case class used with the Dataset API?
I tried to find an example but have not found one yet. This can probably be done with a custom encoder (mapping?), but I could not find an example of that either.
This might be achievable using the Frameless library: https://github.com/adelbertc/frameless but there should be an easy way to get it done with the base Spark libraries.
Update
I am using: "org.apache.spark" %% "spark-core" % "1.6.1"
I get the following error when trying to use an Option[Int]:
Unable to find encoder for type stored in a Dataset. Primitive types (Int, String, etc) and Product types (case classes) are supported by importing sqlContext.implicits._ Support for serializing other types will be added in future releases
Solution update
Since I was prototyping, I was declaring the case class inside the function, just before the conversion to the Dataset (in my case, inside object Main).
Option types worked just fine once I moved the case class outside of the Main function.
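The layout described in this update can be sketched as follows. This is a minimal sketch, not the asker's actual code: it assumes Spark 1.6.x with spark-sql on the classpath in addition to spark-core (the Dataset API lives in spark-sql), and the Record case class, its field names, and the application name are hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical record type. It is declared at the top level,
// not inside main, so the implicit product encoder can be derived.
case class Record(id: Int, score: Option[Int])

object Main {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("option-field-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // brings the case class encoder into scope

    // The Option[Int] field becomes a nullable integer column.
    val ds = Seq(Record(1, Some(10)), Record(2, None)).toDS()
    ds.show()

    sc.stop()
  }
}
```

Moving the Record declaration inside main is the variant that reportedly triggers the "Unable to find encoder" error above.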
Recommended answer
We only define implicits for a subset of the types we support in SQLImplicits. We should probably consider adding Option[T] for common T, as the internal infrastructure does understand Option. You can work around this by either creating a case class, using a Tuple, or constructing the required implicit yourself (though this uses an internal API, so it may break in future releases).
implicit def optionalInt: org.apache.spark.sql.Encoder[Option[Int]] = org.apache.spark.sql.catalyst.encoders.ExpressionEncoder()
val ds = Seq(Some(1), None).toDS()
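The other two workarounds mentioned in the answer (a case class or a Tuple) avoid the internal API. The following is a rough sketch under the same assumptions as the question (Spark 1.6.x, spark-sql on the classpath); the MaybeInt wrapper, the object name, and the sample values are hypothetical and only for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical wrapper type, declared at the top level.
case class MaybeInt(value: Option[Int])

object OptionWorkarounds {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("option-workarounds").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._ // encoders for case classes and tuples

    // Workaround 1: wrap the Option in a case class.
    val wrapped = Seq(MaybeInt(Some(1)), MaybeInt(None)).toDS()

    // Workaround 2: carry the Option inside a tuple.
    val tupled = Seq((1, Option(10)), (2, Option.empty[Int])).toDS()

    wrapped.show()
    tupled.show()

    sc.stop()
  }
}
```

Both variants rely only on the public product encoders, so they should be less likely to break across releases than the ExpressionEncoder approach.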