Crawl a list of sites one by one with Scrapy


Question

I am trying to crawl a list of sites with Scrapy. I tried putting the list of website URLs in start_urls, but then I found that this used more memory than I can afford. Is there any way to make Scrapy crawl only one or two sites at a time?

Recommended answer

You can try setting CONCURRENT_REQUESTS = 1 so that Scrapy performs only one request at a time and you are not overloaded with data:

http://doc.scrapy.org/en/latest/topics/settings.html#concurrent-requests
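As a minimal sketch of that suggestion, the fragment below shows how the setting might look in a project's settings.py. CONCURRENT_REQUESTS is the setting documented at the link above; the CONCURRENT_REQUESTS_PER_DOMAIN line is an extra assumption that is not part of the original answer, included only in case a per-site limit is also wanted.

```python
# settings.py (fragment) for a Scrapy project

# Maximum number of requests the Scrapy downloader performs at the same time.
# The answer above suggests 1 so that only one request is in flight.
CONCURRENT_REQUESTS = 1

# Assumption, not from the original answer: also cap concurrent requests
# to any single domain, so one site is never hit in parallel.
CONCURRENT_REQUESTS_PER_DOMAIN = 1
```

These values limit how many requests are in flight, not how many URLs are queued, so memory use may still depend on how the URL list itself is loaded.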
