Why am I getting '_SIGCHLDWaker' object has no attribute 'doWrite' in Scrapy?


Question

I am using Scrapy spiders inside Celery, and I am randomly getting this kind of error:

Unhandled Error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/twisted/python/log.py", line 103, in callWithLogger
    return callWithContext({"system": lp}, func, *args, **kw)
  File "/usr/lib/python2.7/site-packages/twisted/python/log.py", line 86, in callWithContext
    return context.call({ILogContext: newCtx}, func, *args, **kw)
  File "/usr/lib/python2.7/site-packages/twisted/python/context.py", line 122, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/usr/lib/python2.7/site-packages/twisted/python/context.py", line 85, in callWithContext
    return func(*args,**kw)
--- <exception caught here> ---
  File "/usr/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 602, in _doReadOrWrite
    why = selectable.doWrite()
exceptions.AttributeError: '_SIGCHLDWaker' object has no attribute 'doWrite'

I am using:

celery==3.1.19
Django==1.9.4
Scrapy==1.3.0

This is how I run Scrapy inside Celery:

from billiard import Process
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

class MyCrawlerScript(Process):
    def __init__(self, **kwargs):
        Process.__init__(self)
        # get_project_settings() takes no arguments; it locates the project
        # via the SCRAPY_SETTINGS_MODULE environment variable
        settings = get_project_settings()
        self.crawler = CrawlerProcess(settings)
        self.spider_name = kwargs.get('spider_name')
        self.kwargs = kwargs

    def run(self):
        # runs in a separate billiard-forked process; 'qwargs' is passed
        # through to the spider as a keyword argument
        self.crawler.crawl(self.spider_name, qwargs=self.kwargs)
        self.crawler.start()

def my_crawl_manager(**kwargs):
    crawler = MyCrawlerScript(**kwargs)
    crawler.start()
    crawler.join()

Inside a Celery task, I am calling:

my_crawl_manager(spider_name='my_spider', url='www.google.com/any-url-here')
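For reference, a minimal sketch of such a task (the decorator and task name here are illustrative assumptions, not from the original code; only the my_crawl_manager call is):

from celery import shared_task

@shared_task
def crawl_task(url):
    # Illustrative wrapper: delegates to the billiard-Process-based manager
    # so the crawl runs outside the Celery worker process itself
    my_crawl_manager(spider_name='my_spider', url=url)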

Does anyone have any idea why this is happening?

P.S.: I have asked another question, Why I am Getting KeyError in Scrapy? I don't know if they are somehow related.

Answer

I had the same issue. I'm working within a complex application that uses asyncio, multiprocessing, Twisted, and Scrapy all together.

The solution for me was to use asyncioreactor, installing the alternate reactor before any Scrapy imports:

from twisted.internet import asyncioreactor
asyncioreactor.install()

# Scrapy (and anything else that imports the Twisted reactor) must be
# imported only after this point
from scrapy.crawler import CrawlerProcess
# ...
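A quick way to confirm which reactor ended up installed (this check is an addition for illustration, not part of the original answer):

from twisted.internet import asyncioreactor
asyncioreactor.install()

from twisted.internet import reactor
# Prints 'AsyncioSelectorReactor' once asyncioreactor is installed;
# on Linux the default would otherwise be 'EPollReactor'
print(type(reactor).__name__)

Note that asyncioreactor.install() raises ReactorAlreadyInstalledError if the default reactor has already been pulled in, which is why the install has to be the very first Twisted-related statement executed in the process.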

