scrapy: Call a function when a spider quits


Question

Is there a way to trigger a method in a Spider class just before the spider terminates?

I can terminate the spider myself, like this:

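from scrapy.exceptions import CloseSpider
from scrapy.spiders import CrawlSpider

class MySpider(CrawlSpider):
    # Config stuff goes here...

    def quit(self):
        # Do some stuff...
        raise CloseSpider('MySpider is quitting now.')

    def my_parser(self, response):
        # termination_condition is a placeholder for your own check.
        if termination_condition:
            self.quit()

        # Parsing stuff goes here...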

But I can't find any information on how to determine when the spider is about to quit naturally.

Recommended answer

It looks like you can register a signal listener through dispatcher.

I would try something like:

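from scrapy import signals
from scrapy.spiders import CrawlSpider
from scrapy.xlib.pydispatch import dispatcher

class MySpider(CrawlSpider):
    def __init__(self, *args, **kwargs):
        # Let CrawlSpider finish its own setup before connecting the listener.
        super().__init__(*args, **kwargs)
        dispatcher.connect(self.spider_closed, signals.spider_closed)

    def spider_closed(self, spider):
        # The second parameter is the instance of the spider about to be closed.
        pass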

In newer versions of Scrapy, scrapy.xlib.pydispatch is deprecated. Instead, you can use from pydispatch import dispatcher.
