Modifying CSV export in Scrapy


Question

I seem to be missing something very simple. All I want to do is use ; as the delimiter in the CSV exporter instead of ,.

I know the CSV exporter passes kwargs to the csv writer, but I can't figure out how to pass the delimiter to it.

I am calling my spider like this:

scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv 


Recommended answer

In contrib/feedexport.py, the exporter is instantiated without any extra keyword arguments, so there is no built-in way to hand it a delimiter:

class FeedExporter(object):

    ...

    def open_spider(self, spider):
        file = TemporaryFile(prefix='feed-')
        exp = self._get_exporter(file)  # <-- this is where the exporter is instantiated
        exp.start_exporting()
        self.slots[spider] = SpiderSlot(file, exp)

    def _get_exporter(self, *a, **kw):
        return self.exporters[self.format](*a, **kw)  # <-- not passed in :(

You will need to make your own exporter. Here's an example:

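# Note: these import paths match the older Scrapy release this answer targets;
# in newer releases CsvItemExporter lives in scrapy.exporters.
from scrapy.conf import settings
from scrapy.contrib.exporter import CsvItemExporter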


class CsvOptionRespectingItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        delimiter = settings.get('CSV_DELIMITER', ',')
        kwargs['delimiter'] = delimiter
        super(CsvOptionRespectingItemExporter, self).__init__(*args, **kwargs)
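
To see why overriding the keyword arguments is enough, here is a minimal standard-library sketch of the same pattern. It does not use Scrapy at all: MiniCsvExporter, DelimiterAwareExporter and render are hypothetical names invented for illustration, and the plain dict stands in for the Scrapy settings object. The only assumption carried over from the answer is that the exporter forwards its extra kwargs to csv.writer.

```python
import csv
import io


class MiniCsvExporter:
    """Hypothetical stand-in for an exporter that forwards kwargs to csv.writer."""

    def __init__(self, file, **kwargs):
        # Extra keyword arguments (delimiter, quotechar, ...) go straight to csv.writer.
        self.writer = csv.writer(file, **kwargs)

    def export_row(self, row):
        self.writer.writerow(row)


class DelimiterAwareExporter(MiniCsvExporter):
    """Injects a delimiter read from a settings-like dict before delegating."""

    def __init__(self, file, settings, **kwargs):
        # Fall back to the usual comma when the setting is absent.
        kwargs['delimiter'] = settings.get('CSV_DELIMITER', ',')
        super().__init__(file, **kwargs)


def render(rows, settings):
    """Write rows through the exporter and return the resulting CSV text."""
    buffer = io.StringIO()
    exporter = DelimiterAwareExporter(buffer, settings)
    for row in rows:
        exporter.export_row(row)
    return buffer.getvalue()


print(render([['name', 'price'], ['apple', '1,50']], {'CSV_DELIMITER': ';'}))
```

With ; as the delimiter, a value such as 1,50 no longer needs quoting, which is a common reason for wanting this setting in the first place.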

In the settings.py file of your crawler directory, add this:

FEED_EXPORTERS = {
    'csv': 'importable.path.to.CsvOptionRespectingItemExporter',
}

Now you can run your spider as follows:

scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv --set CSV_DELIMITER=';'

HTH.
