Why am I getting "'_SIGCHLDWaker' object has no attribute 'doWrite'" in Scrapy?
Question
I am using Scrapy spiders inside Celery, and I randomly get this kind of error:
Unhandled Error
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/usr/lib/python2.7/site-packages/twisted/python/log.py", line 86, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/usr/lib/python2.7/site-packages/twisted/python/context.py", line 122, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/usr/lib/python2.7/site-packages/twisted/python/context.py", line 85, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/usr/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 602, in _doReadOrWrite
why = selectable.doWrite()
exceptions.AttributeError: '_SIGCHLDWaker' object has no attribute 'doWrite'
I am using:
celery==3.1.19
Django==1.9.4
Scrapy==1.3.0
This is how I run Scrapy inside Celery:
from billiard import Process
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings


class MyCrawlerScript(Process):
    def __init__(self, **kwargs):
        Process.__init__(self)
        settings = get_project_settings('my_scraper')
        self.crawler = CrawlerProcess(settings)
        self.spider_name = kwargs.get('spider_name')
        self.kwargs = kwargs

    def run(self):
        self.crawler.crawl(self.spider_name, qwargs=self.kwargs)
        self.crawler.start()


def my_crawl_manager(**kwargs):
    crawler = MyCrawlerScript(**kwargs)
    crawler.start()
    crawler.join()
Inside a Celery task, I am calling:
my_crawl_manager(spider_name='my_spider', url='www.google.com/any-url-here')
Any idea why this is happening?
P.S.: I have asked another question, "Why am I getting KeyError in Scrapy?" I don't know if the two are somehow related.
Answer
I had the same issue. I'm working within a complex application that uses asyncio, multiprocessing, Twisted, and Scrapy all together.
The solution for me was to switch to asyncioreactor by installing the alternate reactor before any Scrapy imports:
from twisted.internet import asyncioreactor
asyncioreactor.install()
from scrapy import stuff  # any Scrapy imports must come after install()
# ...
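To see why the order matters: Twisted picks a reactor the first time `twisted.internet.reactor` is imported, and `install()` fails once a reactor already exists. The following is a minimal stdlib sketch of that first-install-wins pattern, not Twisted's actual internals; the names `install` and `get_reactor` are made up for illustration:

```python
# Illustrative sketch of the "install before first import" rule behind
# asyncioreactor.install(); names here are hypothetical, not Twisted's API.
_reactor = None

def install(name):
    """Register a reactor; fails if one is already in place."""
    global _reactor
    if _reactor is not None:
        raise RuntimeError("reactor already installed")
    _reactor = name

def get_reactor():
    """Mimics `from twisted.internet import reactor`: falls back to the
    platform default if nothing was installed first."""
    global _reactor
    if _reactor is None:
        _reactor = "SelectReactor"  # stand-in for the platform default
    return _reactor

# Installing before anything touches the reactor wins:
install("AsyncioSelectorReactor")
print(get_reactor())  # -> AsyncioSelectorReactor
```

Importing Scrapy (which pulls in the reactor) before calling `install()` is the mirror image: the default reactor gets created first, and the asyncio one can no longer be installed.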