How to use User Defined Types in Spark 2.0?


Question


In Spark 2.0, the one example I've found of creating a UDT in Scala seems to no longer be applicable. The UserDefinedType class has been set as private, with the comment:


Note: This was previously a developer API in Spark 1.x. We are making this private in Spark 2.0 because we will very likely create a new version of this that works better with Datasets.


UDTRegistration may be intended as the new mechanism for declaring UDTs, but it is also private.


So far, my research tells me that there is no way to declare your own UDTs in Spark 2.0; is this conclusion correct?

Recommended answer


Well, you are right for now: Spark 2.x no longer exposes any kind of UDT as a public API the way Spark 1.x did.


You can see in ticket SPARK-14155 that they made it private in order to create a new API. There is also a ticket, SPARK-7768, that has been open since Spark 1.5 and that we hope will be closed in Spark 2.2.


Well, for now there is no good way to create your own UDT, but there are a few tricks that let you put custom objects into a Dataset. Here is one example.
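One such trick, which may or may not be the one the linked example uses, is to give Spark a Kryo-based encoder for the custom class through `Encoders.kryo`. The sketch below assumes Spark 2.x with `spark-sql` on the classpath; the `Point` class and the `KryoDatasetSketch` object are hypothetical names made up for illustration.

```scala
import org.apache.spark.sql.{Dataset, Encoder, Encoders, SparkSession}

// Hypothetical custom type (not from the original post).
class Point(val x: Double, val y: Double) extends Serializable {
  override def toString: String = s"Point($x, $y)"
}

object KryoDatasetSketch {
  // Builds a Dataset[Point], doubles every coordinate, and collects the result.
  def buildAndCollect(spark: SparkSession): Array[Point] = {
    // The Kryo encoder serializes each Point into a single binary column,
    // so no UserDefinedType is needed.
    implicit val pointEncoder: Encoder[Point] = Encoders.kryo[Point]

    val ds: Dataset[Point] =
      spark.createDataset(Seq(new Point(1.0, 2.0), new Point(3.0, 4.0)))

    // Typed transformations work as usual because the encoder is in implicit scope.
    ds.map((p: Point) => new Point(p.x * 2, p.y * 2)).collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("kryo-dataset-sketch")
      .getOrCreate()
    try buildAndCollect(spark).foreach(println)
    finally spark.stop()
  }
}
```

The trade-off is that the objects are stored as an opaque binary blob, so you cannot select or filter on `x` and `y` as columns; you only get typed operations such as `map` and `filter` on the objects themselves.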
