Scrapy - Activating an Item Pipeline component - ITEM_PIPELINES setting

Question

The Scrapy documentation contains this information:

Activating an Item Pipeline component

To activate an Item Pipeline component you must add its class to the ITEM_PIPELINES setting, like in the following example:

ITEM_PIPELINES = {
    'myproject.pipelines.PricePipeline': 300,
    'myproject.pipelines.JsonWriterPipeline': 800,
}

The integer values you assign to classes in this setting determine the order they run in- items go through pipelines from order number low to high. It’s customary to define these numbers in the 0-1000 range.

I do not understand the last paragraph, mainly "determine the order they run in- items go through pipelines from order number low to high". Can you explain it in other words? What are the numbers chosen based on? And how do I choose values within the 0-1000 range?

Recommended answer

ITEM_PIPELINES has to be a dictionary (like a lot of other settings, for example SPIDER_MIDDLEWARES), and a dictionary in Python does not rank its entries by priority: historically it was an unordered collection, and since Python 3.7 it only preserves insertion order. You therefore need some other way to define the order in which pipelines are applied. This is why you assign a number, customarily from 0 to 1000, to each pipeline you define. Only the relative order of the numbers matters: the pipeline with the lowest number processes each item first.
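To make the ordering concrete, here is a minimal sketch in plain Python (it does not import Scrapy). It reuses the two class paths from the question and deliberately lists them in the "wrong" order to show that the position in the dictionary is irrelevant and only the numbers decide. The variable name run_order is just an illustration, not something Scrapy defines.

```python
# Settings dict using the class paths from the question.
# The entries are intentionally written in reverse order.
ITEM_PIPELINES = {
    'myproject.pipelines.JsonWriterPipeline': 800,
    'myproject.pipelines.PricePipeline': 300,
}

# Sort the class paths by their assigned number, lowest first.
# run_order is a hypothetical name used only for this sketch.
run_order = sorted(ITEM_PIPELINES, key=ITEM_PIPELINES.get)

for path in run_order:
    print(ITEM_PIPELINES[path], path)
```

With these values PricePipeline comes before JsonWriterPipeline. Leaving gaps between the numbers (300 and 800 rather than 1 and 2) makes it possible to slot another pipeline in between later without renumbering the existing ones.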

FYI, if you look into the Scrapy source, you'll find the build_component_list() function, which is called for each setting like ITEM_PIPELINES. It makes a list (an ordered collection) out of the dictionary you define in ITEM_PIPELINES, using the dictionary values for sorting:

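from operator import itemgetter

import six  # third-party Python 2/3 compatibility library used by this code


def build_component_list(base, custom):
    """Compose a component list based on a custom and base dict of components
    (typically middlewares or extensions), unless custom is already a list, in
    which case it's returned.
    """
    if isinstance(custom, (list, tuple)):
        return custom
    # Start from the base dict, then let the custom dict override it.
    compdict = base.copy()
    compdict.update(custom)
    # Skip components whose value is None, then sort the rest by their number.
    items = (x for x in six.iteritems(compdict) if x[1] is not None)
    return [x[0] for x in sorted(items, key=itemgetter(1))]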
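As a quick way to see what this function does, the sketch below adapts it to Python 3 (replacing six.iteritems with dict.items, which is my substitution, not the quoted source) and calls it with made-up dictionaries. The 'builtin.*' paths and the base dict are hypothetical and exist only for this illustration; they are not real Scrapy components.

```python
from operator import itemgetter


def build_component_list(base, custom):
    """Python 3 adaptation of the function quoted above (dict.items instead of six)."""
    if isinstance(custom, (list, tuple)):
        return custom
    compdict = base.copy()
    compdict.update(custom)
    items = (x for x in compdict.items() if x[1] is not None)
    return [x[0] for x in sorted(items, key=itemgetter(1))]


# Hypothetical defaults, for illustration only.
base = {
    'builtin.DefaultPipeline': 500,
    'builtin.StatsPipeline': 900,
}

# Project settings: two pipelines from the question, plus one override.
custom = {
    'myproject.pipelines.JsonWriterPipeline': 800,
    'myproject.pipelines.PricePipeline': 300,
    'builtin.StatsPipeline': None,  # a value of None is filtered out
}

ordered = build_component_list(base, custom)
for path in ordered:
    print(path)
```

The result is sorted purely by the numbers, and the entry set to None is dropped, which follows from the `x[1] is not None` filter in the quoted code.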
