Is Spark SQL UDAF (user defined aggregate function) available in the Python API?


Problem Description


As of Spark 1.5.0 it seems possible to write your own UDAFs for custom aggregations on DataFrames: Spark 1.5 DataFrame API Highlights: Date/Time/String Handling, Time Intervals, and UDAFs


It is, however, unclear to me whether this functionality is supported in the Python API.

Recommended Answer

You cannot define a Python UDAF in Spark 1.5.0-2.0.0. There is a JIRA tracking this feature request, resolved with target "later", so it probably won't happen anytime soon.

You can use a Scala UDAF from PySpark; this is described in Spark: How to map Python with Scala or Java User Defined Functions?
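As a rough, non-runnable sketch of that approach (it assumes a compiled Scala `UserDefinedAggregateFunction` under the placeholder class name `com.example.MyUDAF`, shipped to the cluster via `--jars`; the exact gateway calls may vary by Spark version):

```python
# Hypothetical sketch: register a compiled Scala UDAF from PySpark
# through the Py4J JVM gateway, then invoke it from Spark SQL.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# "com.example.MyUDAF" is a placeholder for your compiled
# UserDefinedAggregateFunction class on the driver/executor classpath.
spark._jsparkSession.udf().register(
    "my_udaf", spark._jvm.com.example.MyUDAF()
)

df = spark.createDataFrame([("a", 1), ("a", 2)], ["key", "value"])
df.createOrReplaceTempView("t")
spark.sql("SELECT key, my_udaf(value) AS agg FROM t GROUP BY key").show()
```

Unlike the `collect_list` workaround, a Scala UDAF aggregates incrementally with partial aggregation, so it scales to arbitrarily large groups.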
