Intermittent "getrandom() initialization failed" using scrapy spider
Question
I built a scrapy spider (scrapy 1.4). The spider is triggered on demand from a django website through django-rq and supervisord.
Here is the supervisord job that listens for django-rq events (Redis is used as the broker):
[program:rq_worker]
command=python3 manage.py rqworker default
directory=/var/www/django-app
autostart=true
autorestart=true
stderr_logfile=/var/log/rq_worker.err.log
stdout_logfile=/var/log/rq_worker.out.log
This setup runs fine. However, from time to time (I cannot reproduce the issue on demand), all the spiders throw the same OpenSSL error:
2018-02-11 11:02:19 [scrapy.core.scraper] ERROR: Error downloading <GET https://whateverwebsite.com>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
result = result.throwExceptionIntoGenerator(g)
File "/usr/local/lib/python3.5/dist-packages/twisted/python/failure.py", line 393, in throwExceptionIntoGenerator
return g.throw(self.type, self.value, self.tb)
File "/usr/local/lib/python3.5/dist-packages/scrapy/core/downloader/middleware.py", line 43, in process_request
defer.returnValue((yield download_func(request=request,spider=spider)))
twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure OpenSSL.SSL.Error: [('', 'osrandom_rand_bytes', 'getrandom() initialization failed.')]>]
Restarting supervisord makes the issue disappear.
To make sure my website and its spiders are running properly, I have to check each time supervisord is restarted that the issue is absent. Not a big deal, but still...
I would like to understand what is going wrong here. How can I troubleshoot this issue? Is it related to supervisord, Twisted, or OpenSSL?
Thanks for your help.
Recommended answer
I had a similar error, but with the python-requests library:
Error([('', 'osrandom_rand_bytes', 'getrandom() initialization failed.')],)
This was caused by the random number generator failing to gather enough entropy in time. Installing rng-tools solved the problem.
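To check whether low entropy is a plausible cause on a given machine, one option is to read the kernel's entropy estimate before starting the worker. The sketch below is only an illustration, not part of the original answer: it assumes a Linux host that exposes /proc/sys/kernel/random/entropy_avail, and the names read_entropy_avail, entropy_is_low and the threshold of 200 bits are hypothetical choices made for this example. On recent kernels the reported figure tends to be less informative, so this is mainly useful on older systems like the one in the question.

```python
# Minimal sketch: inspect the kernel's entropy estimate on Linux.
# Assumption: the host exposes the standard procfs file below.
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"

# Assumption: an arbitrary cutoff chosen for illustration, not an official value.
LOW_ENTROPY_THRESHOLD = 200


def parse_entropy(text):
    # The file holds a single integer (bits of entropy) followed by a newline.
    return int(text.strip())


def read_entropy_avail(path=ENTROPY_PATH):
    # Return the entropy estimate in bits, or None if it cannot be read
    # (non-Linux host, restricted container, unexpected file contents).
    try:
        with open(path) as f:
            return parse_entropy(f.read())
    except (OSError, ValueError):
        return None


def entropy_is_low(bits, threshold=LOW_ENTROPY_THRESHOLD):
    # Unknown values (None) are not reported as low.
    return bits is not None and bits < threshold


if __name__ == "__main__":
    bits = read_entropy_avail()
    if bits is None:
        print("entropy estimate not available on this system")
    elif entropy_is_low(bits):
        print("low entropy: {} bits; an entropy daemon such as rng-tools may help".format(bits))
    else:
        print("entropy looks fine: {} bits".format(bits))
```

Running this shortly after boot, or at the moment the spiders start failing, gives a rough indication of whether the pool is starved. If the value is consistently low, that is consistent with the explanation above.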