AttributeError: 'NoneType' object has no attribute 'setCallSite'
Problem description
In PySpark, I want to calculate the correlation between two DataFrame vectors using the following code (importing pyspark and calling createDataFrame both work fine):
from pyspark.ml.linalg import Vectors
from pyspark.ml.stat import Correlation
import pyspark
spark = pyspark.sql.SparkSession.builder.master("local[*]").getOrCreate()
data = [(Vectors.sparse(4, [(0, 1.0), (3, -2.0)]),),
(Vectors.dense([4.0, 5.0, 0.0, 3.0]),)]
df = spark.createDataFrame(data, ["features"])
r1 = Correlation.corr(df, "features").head()
print("Pearson correlation matrix:\n" + str(r1[0]))
However, I get an AttributeError ('NoneType' object has no attribute 'setCallSite'):
AttributeError Traceback (most recent call last)
<ipython-input-136-d553c1ade793> in <module>()
6 df = spark.createDataFrame(data, ["features"])
7
----> 8 r1 = Correlation.corr(df, "features").head()
9 print("Pearson correlation matrix:\n" + str(r1[0]))
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in head(self, n)
1130 """
1131 if n is None:
-> 1132 rs = self.head(1)
1133 return rs[0] if rs else None
1134 return self.take(n)
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in head(self, n)
1132 rs = self.head(1)
1133 return rs[0] if rs else None
-> 1134 return self.take(n)
1135
1136 @ignore_unicode_prefix
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in take(self, num)
502 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
503 """
--> 504 return self.limit(num).collect()
505
506 @since(1.3)
/usr/local/lib/python3.6/dist-packages/pyspark/sql/dataframe.py in collect(self)
463 [Row(age=2, name=u'Alice'), Row(age=5, name=u'Bob')]
464 """
--> 465 with SCCallSiteSync(self._sc) as css:
466 port = self._jdf.collectToPython()
467 return list(_load_from_socket(port, BatchedSerializer(PickleSerializer())))
/usr/local/lib/python3.6/dist-packages/pyspark/traceback_utils.py in __enter__(self)
70 def __enter__(self):
71 if SCCallSiteSync._spark_stack_depth == 0:
---> 72 self._context._jsc.setCallSite(self._call_site)
73 SCCallSiteSync._spark_stack_depth += 1
74
AttributeError: 'NoneType' object has no attribute 'setCallSite'
Any solution?
There's a resolved issue around this:
https://issues.apache.org/jira/browse/SPARK-27335?jql=text%20~%20%22setcallsite%22
[Note: since it's resolved, if you're using a Spark version more recent than October 2019 and still encounter this issue, please report it to the Apache Jira]
The poster suggests forcing your DataFrame's backend to sync with your Spark context:
df.sql_ctx.sparkSession._jsparkSession = spark._jsparkSession
df._sc = spark._sc
This worked for us, hopefully can work in other cases as well.