Python Scrapy: How to get CsvItemExporter to write columns in a specific order

Question

In Scrapy, I have my items specified in a certain order in items.py, and my spider populates them in the same order. However, when I run the spider and save the results as a CSV, the column order from items.py or the spider is not maintained. How can I get the CSV to show columns in a specific order? Example code would be much appreciated.

Thanks.

Recommended answer

This is related to the question "Modifying CSV export in scrapy".

The problem is that the exporter is instantiated without any keyword arguments, so settings such as EXPORT_FIELDS are ignored. The solution is the same: subclass the CSV item exporter so that it passes the keyword arguments.

Following the recipe from that related question, I created a new file xyzzy/feedexport.py (change "xyzzy" to the name of your Scrapy project package):

"""
The standard CsvItemExporter class does not pass the kwargs through to the
CSV writer, resulting in EXPORT_FIELDS and EXPORT_ENCODING being ignored
(EXPORT_EMPTY is not used by CSV).
"""

# These import paths belong to old Scrapy releases. On newer releases the
# exporter lives in scrapy.exporters, and project settings are available
# through scrapy.utils.project.get_project_settings().
from scrapy.conf import settings
from scrapy.contrib.exporter import CsvItemExporter


class CSVkwItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # Take the column order and encoding from the project settings.
        kwargs['fields_to_export'] = settings.getlist('EXPORT_FIELDS') or None
        kwargs['encoding'] = settings.get('EXPORT_ENCODING', 'utf-8')

        super(CSVkwItemExporter, self).__init__(*args, **kwargs)
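
The subclassing pattern can be tried without Scrapy installed. The sketch below is only an illustration: BaseCsvExporter, KwCsvExporter, the SETTINGS dict and the sample item are hypothetical stand-ins built on the standard library's csv.DictWriter, not Scrapy's classes. It shows how injecting fields_to_export in the subclass constructor fixes the column order regardless of the item's own key order.

```python
import csv
import io

# Hypothetical stand-in for the project settings (a plain dict here,
# not Scrapy's settings object).
SETTINGS = {"EXPORT_FIELDS": ["name", "price", "url"]}


class BaseCsvExporter:
    """Toy exporter (not Scrapy's): writes dict items as CSV rows."""

    def __init__(self, stream, fields_to_export=None):
        self.stream = stream
        self.fields_to_export = fields_to_export
        self._writer = None

    def export_item(self, item):
        if self._writer is None:
            # Without an explicit list, fall back to the item's own key order.
            fields = self.fields_to_export or list(item)
            self._writer = csv.DictWriter(
                self.stream,
                fieldnames=fields,
                extrasaction="ignore",
                lineterminator="\n",
            )
            self._writer.writeheader()
        self._writer.writerow(item)


class KwCsvExporter(BaseCsvExporter):
    """Injects the configured field order before delegating to the base class."""

    def __init__(self, *args, **kwargs):
        kwargs["fields_to_export"] = SETTINGS.get("EXPORT_FIELDS") or None
        super().__init__(*args, **kwargs)


out = io.StringIO()
exporter = KwCsvExporter(out)
# The item's keys are deliberately in a different order than EXPORT_FIELDS.
exporter.export_item({"url": "http://example.com/a", "price": 3, "name": "A"})
print(out.getvalue())
```

The header comes out as name,price,url because the subclass supplies the list; the caller never has to pass it, which is the situation with the feed exporter that Scrapy instantiates itself.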

I then registered the new exporter in xyzzy/settings.py:

FEED_EXPORTERS = {
    'csv': 'xyzzy.feedexport.CSVkwItemExporter'
}

Now the CSV exporter will honor the EXPORT_FIELDS setting, which I also added to xyzzy/settings.py:

# By specifying the fields to export, the CSV export honors the order
# rather than using a random order.
EXPORT_FIELDS = [
    'field1',
    'field2',
    'field3',
]
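
Later Scrapy releases document a built-in FEED_EXPORT_FIELDS setting that serves the same purpose, so on those versions the custom exporter may not be needed. A minimal settings.py sketch, reusing the placeholder field names from above (check the feed export documentation for the Scrapy version you have installed):

```python
# xyzzy/settings.py
# Built-in setting in later Scrapy releases: lists the fields to export
# and the order in which the columns are written.
FEED_EXPORT_FIELDS = [
    'field1',
    'field2',
    'field3',
]
```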
