Recording the total time taken for running a spider in Scrapy

Problem Description

I am using Scrapy to scrape a site.

I wrote a spider that fetches all the items from the page and saves them to a CSV file. Now I want to record the total execution time Scrapy takes to run the spider. When the spider finishes, the terminal displays results such as start_time, finish_time, and so on. So in my program I need to calculate the total time Scrapy took to run the spider and store that total somewhere.

Can anyone show me how to do this with an example?

Thanks in advance.

Solution

This could be useful:

from pydispatch import dispatcher  # scrapy.xlib.pydispatch was removed; the PyDispatcher package provides the same API
from scrapy import signals
from datetime import datetime, timezone

def handle_spider_closed(spider, reason):
    # the old global scrapy.stats object is gone; stats now live on the crawler
    stats = spider.crawler.stats.get_stats()
    print('Spider closed:', spider.name, stats)
    # start_time is recorded in UTC by the CoreStats extension; it is
    # timezone-aware in recent Scrapy releases (on older versions where it
    # is naive, use datetime.utcnow() instead)
    print('Work time:', datetime.now(timezone.utc) - stats['start_time'])


dispatcher.connect(handle_spider_closed, signals.spider_closed)
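
On recent Scrapy versions you can also skip the signal wiring entirely. Below is a minimal sketch (the spider name, URL, and selector are made-up examples): it records its own start marker and reports the elapsed time from the spider's closed() callback, which Scrapy invokes automatically when the run finishes:

import time
import scrapy

class TimedSpider(scrapy.Spider):
    name = 'timed_example'                 # hypothetical spider name
    start_urls = ['https://example.com/']  # hypothetical target site

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # monotonic clock: unaffected by system clock adjustments
        self._started = time.monotonic()

    def parse(self, response):
        yield {'title': response.css('title::text').get()}

    def closed(self, reason):
        # called automatically once the spider finishes
        elapsed = time.monotonic() - self._started
        self.logger.info('Total run time: %.2f seconds', elapsed)

To store the total time somewhere, as the question asks, you could append the elapsed value to a file inside closed(), or record it with self.crawler.stats.set_value('run_time_seconds', elapsed) so it is kept alongside Scrapy's own run statistics.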
