Modifying CSV export in Scrapy
Question
I seem to be missing something very simple. All I want to do is use ';' as the delimiter in the CSV exporter instead of ','.
I know the CSV exporter passes kwargs to the csv writer, but I can't seem to figure out how to pass the delimiter to it.
I am calling my spider like this:
scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv
Recommended answer
In contrib/feedexport.py, the exporter is instantiated without any extra keyword arguments:
class FeedExporter(object):
    ...
    def open_spider(self, spider):
        file = TemporaryFile(prefix='feed-')
        exp = self._get_exporter(file)  # <-- this is where the exporter is instantiated
        exp.start_exporting()
        self.slots[spider] = SpiderSlot(file, exp)

    def _get_exporter(self, *a, **kw):
        return self.exporters[self.format](*a, **kw)  # <-- not passed in :(
You will need to write your own exporter. Here's an example:
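# These import paths match the older Scrapy layout this answer targets.
# Newer releases expose CsvItemExporter from scrapy.exporters and no
# longer ship scrapy.conf.
from scrapy.conf import settings
from scrapy.contrib.exporter import CsvItemExporter


class CsvOptionRespectingItemExporter(CsvItemExporter):
    def __init__(self, *args, **kwargs):
        # Read the delimiter from the project settings, defaulting to a comma.
        delimiter = settings.get('CSV_DELIMITER', ',')
        kwargs['delimiter'] = delimiter
        super(CsvOptionRespectingItemExporter, self).__init__(*args, **kwargs)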
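To see why overriding __init__ is enough, here is a minimal standard-library sketch of the same pattern. It does not use Scrapy at all: SketchCsvExporter is a hypothetical stand-in that forwards its keyword arguments to csv.writer, as the question says the real exporter does. The settings dictionary, the class names, and the export_rows helper are all assumptions made for illustration.

```python
import csv
import io


class SketchCsvExporter:
    """Hypothetical stand-in for a CSV exporter; this is NOT Scrapy's class.

    It forwards extra keyword arguments to csv.writer, which is the
    behaviour the question describes for the real exporter.
    """

    def __init__(self, file, **kwargs):
        self.writer = csv.writer(file, **kwargs)

    def export_row(self, row):
        self.writer.writerow(row)


class DelimiterFromSettingsExporter(SketchCsvExporter):
    """Injects a delimiter from a settings mapping before delegating."""

    # Assumed stand-in for project settings; a plain dict for illustration.
    settings = {"CSV_DELIMITER": ";"}

    def __init__(self, *args, **kwargs):
        # Same idea as the answer: set the keyword argument, then call the parent.
        kwargs["delimiter"] = self.settings.get("CSV_DELIMITER", ",")
        super().__init__(*args, **kwargs)


def export_rows(rows):
    """Write rows through the subclass and return the resulting CSV text."""
    buffer = io.StringIO(newline="")
    exporter = DelimiterFromSettingsExporter(buffer)
    for row in rows:
        exporter.export_row(row)
    return buffer.getvalue()
```

The caller never mentions the delimiter. The subclass adds it to kwargs on the way to the parent constructor, so code that instantiates the exporter with only a file argument still ends up with the configured delimiter.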
In the settings.py file of your crawler directory, add this:
FEED_EXPORTERS = {
    'csv': 'importable.path.to.CsvOptionRespectingItemExporter',
}
Now you can run your spider as follows:
scrapy crawl spidername --set FEED_URI=output.csv --set FEED_FORMAT=csv --set CSV_DELIMITER=';'
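One way to confirm the delimiter took effect is to read the output back with the standard csv module using the same delimiter. This is only a sketch: the read_semicolon_csv helper and the field names in the usage note are hypothetical, and your spider's columns will differ.

```python
import csv
import io


def read_semicolon_csv(text):
    """Parse semicolon-delimited CSV text into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=";")
    return list(reader)
```

For the real file, pass its contents, for example read_semicolon_csv(open('output.csv', newline='').read()). If each row comes back as a single key containing semicolons, the file was not parsed with the expected delimiter.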
HTH.