'NoneType' object has no attribute '_app_data' in scrapy\twisted\openssl

Question

During the scraping process with scrapy, an error appears in my logs from time to time. It doesn't seem to come from anywhere in my code, and looks like it's something inside twisted\openssl. Any ideas what caused this and how to get rid of it?

Stacktrace here:

[Launcher,27487/stderr] Error during info_callback
    Traceback (most recent call last):
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/twisted/protocols/tls.py", line 415, in dataReceived
        self._write(bytes)
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/twisted/protocols/tls.py", line 554, in _write
        sent = self._tlsConnection.send(toSend)
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1270, in send
        result = _lib.SSL_write(self._ssl, buf, len(buf))
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/OpenSSL/SSL.py", line 926, in wrapper
        callback(Connection._reverse_mapping[ssl], where, return_code)
    --- <exception caught here> ---
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 1055, in infoCallback
        return wrapped(connection, where, ret)
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/twisted/internet/_sslverify.py", line 1157, in _identityVerifyingInfoCallback
        transport = connection.get_app_data()
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1589, in get_app_data
        return self._app_data
      File "/opt/webapps/link_crawler/lib/python2.7/site-packages/OpenSSL/SSL.py", line 1148, in __getattr__
        return getattr(self._socket, name)
    exceptions.AttributeError: 'NoneType' object has no attribute '_app_data'

Answer

At first glance, it appears as though this is due to a bug in scrapy. Scrapy defines its own Twisted "context factory": https://github.com/scrapy/scrapy/blob/ad36de4e6278cf635509a1ade30cca9a506da682/scrapy/core/downloader/contextfactory.py#L21-L28

This code instantiates ClientTLSOptions with the context it intends to return. A side effect of instantiating this class is that an "info callback" is installed on the context. The info callback requires that the Twisted TLS implementation has been set as "app data" on the connection. However, since nothing ever uses the ClientTLSOptions instance (it is discarded immediately), the app data is never set.
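
For reference, here is a rough sketch of the pattern the linked context factory follows (reconstructed from the description above; not a verbatim copy of the Scrapy source):

    from OpenSSL import SSL
    from twisted.internet.ssl import ClientContextFactory
    from twisted.internet._sslverify import ClientTLSOptions


    class ScrapyClientContextFactory(ClientContextFactory):
        """Permissive client TLS context factory (simplified sketch)."""

        def getContext(self, hostname=None, port=None):
            ctx = ClientContextFactory.getContext(self)
            ctx.set_options(SSL.OP_ALL)  # enable OpenSSL workarounds for buggy servers
            if hostname:
                # Instantiated purely for the constructor's side effect: it
                # installs an "info callback" (SNI + hostname verification)
                # on ctx. The instance itself is thrown away, so it is never
                # used as a connection creator.
                ClientTLSOptions(hostname, ctx)
            return ctx

The key part is the call inside the if-block: the ClientTLSOptions instance exists only long enough for its constructor to run.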

When the info callback later fires and goes to fetch the Twisted TLS implementation (which it needs for part of its job), it instead finds there is no app data and fails with the exception you've reported.
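
To make the mechanism concrete, here is a simplified stand-in for ClientTLSOptions (the method names mirror Twisted's connection-creator interface, but this is an illustrative approximation, not the real implementation):

    from OpenSSL import SSL


    class SketchedClientTLSOptions(object):
        """Illustrative approximation of twisted.internet._sslverify.ClientTLSOptions."""

        def __init__(self, hostname, ctx):
            self._hostname = hostname
            self._ctx = ctx
            # The side effect described above: merely constructing the object
            # installs the info callback on the SSL context.
            ctx.set_info_callback(self._identityVerifyingInfoCallback)

        def clientConnectionForTLS(self, tlsProtocol):
            # Only this path -- used when the object is handed to Twisted's
            # TLS transport as a connection creator -- stores the transport
            # as app data on the new connection.
            connection = SSL.Connection(self._ctx, None)
            connection.set_app_data(tlsProtocol)
            return connection

        def _identityVerifyingInfoCallback(self, connection, where, ret):
            # Fired by OpenSSL during the handshake; it assumes the transport
            # was stored earlier. In Scrapy's usage clientConnectionForTLS
            # never ran, so this lookup fails with the AttributeError above.
            transport = connection.get_app_data()
            # ... hostname verification against transport would happen here ...

In Scrapy's factory, only the constructor ever runs; the clientConnectionForTLS path that would set the app data is never exercised, which is exactly the gap the traceback exposes.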

The side effect of ClientTLSOptions is a little unpleasant, but I think this is clearly a Scrapy bug caused by misuse/abuse of ClientTLSOptions. I don't think this code could ever have been very well tested, since this error will happen every single time a certificate fails to verify.

I suggest reporting the bug to Scrapy. Hopefully they can fix their use of ClientTLSOptions and eliminate this error for you.
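
For comparison, the way connection creators like ClientTLSOptions are meant to be consumed is by handing them to Twisted's TLS machinery, which calls clientConnectionForTLS and therefore sets the app data before the handshake. A minimal sketch of that intended usage with a plain Twisted client (assuming Twisted 14.0+ and its public optionsForClientTLS helper; this is not a drop-in patch for Scrapy's downloader):

    from __future__ import print_function

    from twisted.internet import reactor
    from twisted.internet.endpoints import SSL4ClientEndpoint, connectProtocol
    from twisted.internet.protocol import Protocol
    from twisted.internet.ssl import optionsForClientTLS


    class PrintAndQuit(Protocol):
        """Toy protocol: send a trivial request, print the reply, stop."""

        def connectionMade(self):
            self.transport.write(b"HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n")

        def dataReceived(self, data):
            print(data)
            self.transport.loseConnection()
            reactor.stop()


    # optionsForClientTLS returns a connection creator; Twisted's TLS transport
    # calls its clientConnectionForTLS(), which stores the transport as app
    # data, so the identity-verifying info callback has what it needs.
    creator = optionsForClientTLS(u"example.com")
    endpoint = SSL4ClientEndpoint(reactor, "example.com", 443, creator)
    connectProtocol(endpoint, PrintAndQuit())
    reactor.run()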
