pyspark sql: AttributeError: 'NoneType' object has no attribute 'join'

Question

def main(inputs, output):

    sdf = spark.read.csv(inputs, schema=observation_schema)
    sdf.registerTempTable('filtertable')

    result = spark.sql("""
    SELECT * FROM filtertable WHERE qflag IS NULL
    """).show()

    temp_max = spark.sql(""" SELECT date, station, value FROM filtertable WHERE (observation = 'TMAX')""").show()
    temp_min = spark.sql(""" SELECT date, station, value FROM filtertable WHERE (observation = 'TMIN')""").show()

    result = temp_max.join(temp_min, condition1).select(temp_max('date'), temp_max('station'), ((temp_max('TMAX')-temp_min('TMIN'))/10)).alias('Range'))

Error:

Traceback (most recent call last):
  File "/Users/syedikram/Documents/temp_range_sql.py", line 96, in <module>
    main(inputs, output)
  File "/Users/syedikram/Documents/temp_range_sql.py", line 52, in main
    result = temp_max.join(temp_min, condition1).select(temp_max('date'), temp_max('station'), ((temp_max('TMAX')-temp_min('TMIN')/10)).alias('Range'))
AttributeError: 'NoneType' object has no attribute 'join'

Performing the join operation gives me a NoneType object error. Searching online didn't help, as there is little documentation for pyspark sql. What am I doing wrong here?

Recommended answer

Remove the .show() calls from the temp_max and temp_min assignments. show() only prints the rows to the console and returns None, so both variables hold None instead of a DataFrame. Calling temp_max.join(...) then raises AttributeError: 'NoneType' object has no attribute 'join'. The first result assignment has the same problem.
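A minimal sketch of the corrected function, under a few assumptions that are not in the question: the function name temp_range and the column aliases tmax, tmin and range are made up for illustration, the join is assumed to be on date and station (condition1 is never shown), and the session and schema are passed in as arguments. It also uses createOrReplaceTempView, which replaced the deprecated registerTempTable, and bracket indexing for columns, since a DataFrame cannot be called like a function.

```python
def temp_range(spark, inputs, schema):
    """Return a DataFrame of daily temperature ranges per station.

    `spark` is an existing SparkSession and `schema` the CSV schema
    (observation_schema in the question). Both are passed in here only
    to keep the sketch self-contained.
    """
    sdf = spark.read.csv(inputs, schema=schema)
    sdf.createOrReplaceTempView('filtertable')

    # No .show() here: spark.sql() returns a DataFrame, show() returns None.
    temp_max = spark.sql("""
        SELECT date, station, value AS tmax
        FROM filtertable
        WHERE observation = 'TMAX'
    """)
    temp_min = spark.sql("""
        SELECT date, station, value AS tmin
        FROM filtertable
        WHERE observation = 'TMIN'
    """)

    # Assumed join keys; the question's condition1 is not shown.
    joined = temp_max.join(temp_min, on=['date', 'station'])

    # Columns are selected by name or with [], not by calling the DataFrame.
    return joined.select(
        'date',
        'station',
        ((joined['tmax'] - joined['tmin']) / 10).alias('range'),
    )
```

To inspect the output, call show() as a separate statement, for example result = temp_range(spark, inputs, observation_schema) followed by result.show(), so that result keeps referring to the DataFrame.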
