Scrapy pipeline spider_opened and spider_closed not being called


Question

I am having some trouble with a Scrapy pipeline. My information is being scraped from sites correctly, and the process_item method is being called. However, the spider_opened and spider_closed methods are not being called.

from scrapy import log  # legacy Scrapy logging API, as used in the original post


class MyPipeline(object):

    def __init__(self):
        log.msg("Initializing Pipeline")
        self.conn = None
        self.cur = None

    def spider_opened(self, spider):
        log.msg("Pipeline.spider_opened called", level=log.DEBUG)

    def spider_closed(self, spider):
        log.msg("Pipeline.spider_closed called", level=log.DEBUG)

    def process_item(self, item, spider):
        log.msg("Processing item " + item['title'], level=log.DEBUG)
        return item  # pipelines must return the item for later stages

Both the __init__ and process_item logging messages are displayed in the log, but the spider_opened and spider_closed messages are not.

I need the spider_opened and spider_closed methods because I want to use them to open and close a connection to a database, but nothing is showing up in the log for them.
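The database lifecycle described above can be sketched as follows. This is a minimal sketch only: sqlite3 and an in-memory database are stand-ins, since the question does not name a backend, and the two handlers still have to be wired to Scrapy's signals before the engine will ever call them, which is exactly the problem being asked about.

```python
import sqlite3


class MyPipeline(object):
    # Intended lifecycle: one DB connection held for the spider's lifetime.
    # sqlite3 and the in-memory DB are assumptions for illustration only.
    def spider_opened(self, spider):
        self.conn = sqlite3.connect(":memory:")
        self.cur = self.conn.cursor()
        self.cur.execute("CREATE TABLE items (title TEXT)")

    def process_item(self, item, spider):
        # Store the scraped title, then pass the item down the pipeline.
        self.cur.execute("INSERT INTO items VALUES (?)", (item["title"],))
        return item

    def spider_closed(self, spider):
        self.conn.commit()
        self.conn.close()
```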

If anyone has any suggestions, that would be very useful.

Answer

Sorry, found it just after I posted this. You have to add:

# with these imports at module level:
from scrapy.xlib.pydispatch import dispatcher
from scrapy import signals

dispatcher.connect(self.spider_opened, signals.spider_opened)
dispatcher.connect(self.spider_closed, signals.spider_closed)

in __init__, otherwise it never receives the signals that call those methods.
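Put together, the mechanism looks like this. Because scrapy.xlib.pydispatch is a legacy API (removed in Scrapy 2.0), the sketch below uses a hypothetical stand-in Dispatcher class that mimics the connect/send behaviour, so it runs without Scrapy installed; it shows that only handlers explicitly connected in __init__ ever fire.

```python
# Hypothetical stand-in for Scrapy's signal dispatcher, for illustration
# only -- real code would use scrapy.xlib.pydispatch or crawler.signals.
class Dispatcher:
    def __init__(self):
        self._handlers = {}

    def connect(self, handler, signal):
        # Register handler so it fires whenever `signal` is sent.
        self._handlers.setdefault(signal, []).append(handler)

    def send(self, signal, **kwargs):
        for handler in self._handlers.get(signal, []):
            handler(**kwargs)


dispatcher = Dispatcher()
SPIDER_OPENED, SPIDER_CLOSED = "spider_opened", "spider_closed"


class MyPipeline(object):
    def __init__(self):
        self.calls = []
        # The crucial lines: without them the dispatcher holds no
        # reference to these methods, so they are never invoked.
        dispatcher.connect(self.spider_opened, SPIDER_OPENED)
        dispatcher.connect(self.spider_closed, SPIDER_CLOSED)

    def spider_opened(self, spider):
        self.calls.append("opened")

    def spider_closed(self, spider):
        self.calls.append("closed")


pipeline = MyPipeline()
# The engine sends these signals when the spider starts and stops.
dispatcher.send(SPIDER_OPENED, spider="demo")
dispatcher.send(SPIDER_CLOSED, spider="demo")
print(pipeline.calls)  # ['opened', 'closed']
```

Note for current Scrapy versions: scrapy.xlib.pydispatch was removed in Scrapy 2.0. The modern equivalents are to connect the handlers in a from_crawler classmethod via crawler.signals.connect(...), or simply to name the methods open_spider and close_spider, which the item pipeline machinery calls automatically.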

