Export csv file from scrapy (not via command line)
Problem description
I successfully exported my items to a CSV file from the command line like this:
scrapy crawl spiderName -o filename.csv
My question is: what is the easiest way to do the same from code? I need this because I read the filename from another file. In the end, I want to call
scrapy crawl spiderName
and have it write the items into filename.csv.
Recommended answer
Why not use an item pipeline?
WriteToCsv.py
import csv
from YOUR_PROJECT_NAME_HERE import settings

def write_to_csv(item):
    # Append one row per item; the with-block closes the file after each write.
    with open(settings.csv_file_path, 'a') as f:
        writer = csv.writer(f, lineterminator='\n')
        writer.writerow([item[key] for key in item.keys()])

class WriteToCsv(object):
    def process_item(self, item, spider):
        write_to_csv(item)
        # Return the item so any later pipelines still receive it.
        return item
settings.py
ITEM_PIPELINES = { 'project.pipelines_path.WriteToCsv.WriteToCsv' : A_NUMBER_HIGHER_THAN_ALL_OTHER_PIPELINES}
csv_file_path = PATH_TO_CSV
If you want items written to a separate CSV file for each spider, you could give each spider a CSV_PATH field. Then, in your pipeline, use the spider's field instead of the path from settings.
This works; I tested it in my project.
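The per-spider variant mentioned above is not spelled out in the answer, so here is a minimal sketch of how it might look. It assumes Python 3 and that each spider defines a CSV_PATH attribute itself. The class name PerSpiderCsvPipeline, the 'items.csv' fallback and the header row are choices made for this illustration, not part of the original answer. open_spider and close_spider are the standard optional pipeline hooks, used here so the file is opened once per crawl rather than once per item.

```python
import csv


class PerSpiderCsvPipeline:
    """Writes each spider's items to the file named by its CSV_PATH attribute."""

    def open_spider(self, spider):
        # CSV_PATH is an attribute you define on the spider yourself (assumption);
        # 'items.csv' is an arbitrary fallback chosen for this sketch.
        path = getattr(spider, 'CSV_PATH', 'items.csv')
        self.file = open(path, 'w', newline='')
        self.writer = csv.writer(self.file)
        self.header_written = False

    def process_item(self, item, spider):
        # dict() accepts both plain dicts and scrapy Item objects.
        row = dict(item)
        if not self.header_written:
            # Use the first item's field names as the header row.
            self.writer.writerow(row.keys())
            self.header_written = True
        self.writer.writerow(row.values())
        return item

    def close_spider(self, spider):
        self.file.close()
```

To try it, a spider would set something like CSV_PATH = 'filename.csv' (or compute the value from another file when it is initialised), and the class would be registered in ITEM_PIPELINES the same way as WriteToCsv above. Note that this sketch overwrites the file on each crawl, whereas the original answer appends.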
HTH
http://doc.scrapy.org/en/latest/topics/item-pipeline.html