Crawl a list of sites one by one with scrapy
Question
I am trying to crawl a list of sites with scrapy. I tried putting the list of website URLs in start_urls, but found that this uses more memory than I can afford. Is there any way to make scrapy crawl only one or two sites at a time?
Recommended answer
You can try setting CONCURRENT_REQUESTS = 1 (Scrapy setting names are uppercase) so that you are not overloaded with data:
http://doc.scrapy.org/en/latest/topics/settings.html#concurrent-requests
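As a minimal sketch of the setting mentioned above, assuming a standard Scrapy project with a settings.py file: the fragment below limits the crawler to one request in flight at a time. CONCURRENT_REQUESTS_PER_DOMAIN is not part of the original answer; it is added here as an assumption, on the guess that a per-domain cap is also wanted. Whether this is enough to solve the memory problem depends on the crawl, so treat it as a starting point rather than a guaranteed fix.

```python
# settings.py (project-level Scrapy settings)

# Allow at most one request to be processed at a time across the whole crawl.
CONCURRENT_REQUESTS = 1

# Assumption, not from the original answer: also cap requests per domain at one.
CONCURRENT_REQUESTS_PER_DOMAIN = 1
```

The same keys could instead be placed in a spider's custom_settings dictionary if the limit should apply to one spider only.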