How to set the default value of an Item.Field() in Scrapy?

Views: 369 | Published: 2021/7/16 22:08:13 | Tags: python, scrapy

Problem description

I'm trying to scrape a website that does not display the same data on every page. I'd like my spider to return a default value for each attribute it could not scrape. I know that this used to be possible in the item declaration, like this:

import scrapy

class MyItem(scrapy.Item):
    myfield = scrapy.Field(default='NULL')

However, this method no longer seems to work (I'm using Scrapy 1.3.0). If I try to export this particular field when the value has not been found, I get:

KeyError: 'myfield'

Is there a workaround?

Recommended answer

Support for default values on fields was removed from Scrapy about 4 years ago (I'm curious which version you were using before). According to Pablo Hoffman, the recommended way is to populate items with default values through a pipeline:

class DefaultValuesPipeline(object):

    def process_item(self, item, spider):
        item.setdefault('field1', 'value1')
        item.setdefault('field2', 'value2')
        # ...
        return item

https://groups.google.com/d/msg/scrapy-users/-v1p5W41VDQ/0W9SIB07iDIJ
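To make the pipeline approach above easier to reuse, here is a minimal sketch that drives it from a single mapping and shows how it might be enabled. The field names in `defaults`, the module path `myproject.pipelines`, and the priority `100` are all assumptions chosen for illustration and do not come from the original answer. The final lines exercise the pipeline on a plain dict so the behavior can be checked without running a crawl.

```python
class DefaultValuesPipeline:
    # Hypothetical defaults: field name -> value to use when the spider
    # did not populate that field. Replace with your own fields.
    defaults = {
        'myfield': 'NULL',
        'price': None,
    }

    def process_item(self, item, spider):
        # setdefault() only writes a value when the key is missing,
        # so anything the spider actually scraped is left untouched.
        for name, value in self.defaults.items():
            item.setdefault(name, value)
        return item


# In settings.py (the module path and priority are assumptions):
ITEM_PIPELINES = {
    'myproject.pipelines.DefaultValuesPipeline': 100,
}

# Quick check with a plain dict standing in for a scraped item.
item = DefaultValuesPipeline().process_item({'price': 9.5}, spider=None)
```

When the item is a scrapy.Item subclass rather than a dict, every key in `defaults` must also be declared as a Field on that item, because assigning to an undeclared field raises a KeyError.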

Alternatively, you can extend the default Field class to implement the desired behavior.
