Passing meta elements through a callback function in Scrapy


Question

I am passing an item through a callback function, as in the example below from the official Scrapy documentation.

I was wondering whether the item passed to parse_page2, once modified inside that function, can be retrieved in its modified form in parse_page1.

For instance, in the example below, parse_page2 adds response.url to the 'other_url' field.

Is there a way to get 'other_url' inside parse_page1 after parse_page2 has finished executing?

def parse_page1(self, response):
    item = MyItem()
    item['main_url'] = response.url
    request = scrapy.Request("http://www.example.com/some_page.html",
                             callback=self.parse_page2)
    request.meta['item'] = item
    return request

def parse_page2(self, response):
    item = response.meta['item']
    item['other_url'] = response.url
    return item

Recommended answer

Instead of creating the item in parse_page1, you can simply pass response.url in the meta dict and create the item in parse_page2.

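from scrapy import Request

def parse_page1(self, response):
    # Pass the first page's URL along in the request's meta dict.
    return Request(url="http://www.example.com/some_page.html",
                   meta={'main_url': response.url},
                   callback=self.parse_page2)

def parse_page2(self, response):
    # Build the item here, where both URLs are available.
    item = MyItem()
    item['main_url'] = response.meta['main_url']
    item['other_url'] = response.url
    return item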

Alternatively, if you really want to pass the information from parse_page2 back, you can use parse_page1 as the callback again and add a conditional to that function:

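def parse_page1(self, response):
    if "other_url" in response.meta:
        # Second pass: parse_page2 has already run and put 'other_url' in meta.
        ...  # do something
    else:
        # First pass: parse_page2 has not been called yet.
        ...  # do something else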
