PySpark - Aggregate expression required for pivot, found 'pythonUDF'

Question
I am using Python 2.6.6 and Spark 1.6.0. I have a df like this:
id | name | number |
--------------------------
1 | joe | 148590 |
2 | bob | 148590 |
2 | steve | 279109 |
3 | sue | 382901 |
3 | linda | 148590 |
Whenever I try to run something like

df2 = df.groupBy('id', 'length', 'type').pivot('id').agg(F.collect_list('name'))

I get the following error:

pyspark.sql.utils.AnalysisException: u"Aggregate expression required for pivot, found 'pythonUDF#93';"

Why is this?
Answer
Resolved. I had used SQLContext to create the original data frame; switching to HiveContext fixed it. In Spark 1.6, collect_list is implemented as a Hive UDAF, so it is only available through a HiveContext (plain SQLContext raises the "Aggregate expression required for pivot" error above).