python - schedule contents are not updated

Views: 417 Published: 2017/9/6 4:44:23 Tags: python pyspider


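Problem description

Question

I'm using the latest version from GitHub, 0.3.9. After changing the project's code, I found that the contents of the schedule were not updated. I wanted the page to be crawled once every half hour, but the spider crawls it once every 10 seconds. I don't know whether this is a bug, or how to fix it.

The code looks like this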

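from pyspider.libs.base_handler import *


class Handler(BaseHandler):
    # Default options applied to every request made by this project
    crawl_config = {
        'headers': {
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.89 Safari/537.36',
        }
    }

    # Intended behaviour: run on_start once every 30 minutes
    @every(minutes=30)
    def on_start(self):
        self.crawl('http://www.xxxx.org/', callback=self.index_page)

    # age=10: a fetched page is treated as still valid for 10 seconds
    @config(age=10)
    def index_page(self, response):
        pass  # body not shown in the original post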

The schedule looks like this

Note: the code originally set an itag, which I later removed.

ACTIVE xxxx.index_page > http://www.xxxx.org/ (8 seconds ago updated )

taskid
9dfac8d63cb01eae0e33701e26de4778
lastcrawltime
1480581196.0514488 (8 seconds ago)
updatetime
1480581196.0515082 (8 seconds ago)
exetime
1480581206.0514526 (1 second ago)
track.fetch  1320.64ms
{
  "content": null,
  "encoding": "GBK",
  "error": null,
  "headers": {},
  "ok": true,
  "redirect_url": null,
  "status_code": 200,
  "time": 1.3206377029418945
}
track.process  34.6ms +16
{
  "exception": null,
  "follows": 16,
  "logs": "",
  "ok": true,
  "result": null,
  "time": 0.03459787368774414
}
schedule
{
  "age": 10,
  "auto_recrawl": true,
  "exetime": 1480581206.0514526,
  "itag": "v223",
  "retried": 21
}
fetch
{}
process
{
  "callback": "index_page"
}
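A quick check on the numbers in the dump above, as a minimal sketch. The three values are copied from the dump. The variable names, and the reading that the next exetime is roughly lastcrawltime plus age, are assumptions made for illustration and not something quoted from pyspider's documentation.

```python
# Values copied from the task dump above.
lastcrawltime = 1480581196.0514488
exetime = 1480581206.0514526
age = 10  # seconds, from the "schedule" block

# Gap between the last crawl and the next scheduled execution.
gap = exetime - lastcrawltime

# Interval the code asks for with @every(minutes=30), in seconds.
wanted = 30 * 60

print(round(gap))  # → 10
print(wanted)      # → 1800
```

The gap matches the age stored in the schedule (10 seconds), not the 30 minutes requested in the code, which is consistent with the task still running on its old stored schedule.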

Solution

You have auto_recrawl set (the schedule dump shows "auto_recrawl": true). Please cancel it as described at http://docs.pyspider.org/en/l...
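One way to act on this, as a hedged sketch. As far as I know, pyspider's self.crawl accepts cancel, force_update and auto_recrawl parameters: cancel=True together with force_update=True cancels a task that is in ACTIVE status, and auto_recrawl=False has to be passed as well for an auto_recrawl task. The names CANCEL_RECRAWL and crawl_kwargs are hypothetical. The snippet only builds the keyword arguments and does not import pyspider, so check the parameters against the documentation for your version.

```python
# Hypothetical helper: keyword arguments for one self.crawl() call that
# cancels the stale auto-recrawling task stored in the scheduler.
CANCEL_RECRAWL = {
    'cancel': True,         # cancel the task
    'force_update': True,   # apply the change even though the task is ACTIVE
    'auto_recrawl': False,  # stop re-crawling the task every `age` seconds
}


def crawl_kwargs(callback, **extra):
    """Merge the cancel flags with the usual self.crawl() arguments."""
    kwargs = {'callback': callback}
    kwargs.update(CANCEL_RECRAWL)
    kwargs.update(extra)
    return kwargs


# Inside the Handler from the question this would be used roughly as:
#     self.crawl('http://www.xxxx.org/', **crawl_kwargs(self.index_page))
print(crawl_kwargs('index_page'))
```

Presumably the flags should be removed again once the stale task is gone, so that later runs are scheduled from the current code.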
