Scrapy Follow & Scrape Next Pages


Problem Description

I am having a problem where none of my Scrapy spiders will crawl a website; each one scrapes a single page and then stops. I was under the impression that the rules member variable was responsible for following links, but I can't get it to follow any. I have been following the documentation here: http://doc.scrapy.org/en/latest/topics/spiders.html#crawlspider

What could I be missing that keeps my bots from crawling?

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors import LinkExtractor
from scrapy.selector import Selector

from Example.items import ExItem

class ExampleSpider(CrawlSpider):
    name = "example"
    allowed_domains = ["example.ac.uk"]
    start_urls = (
        'http://www.example.ac.uk',
    )

    rules = ( Rule (LinkExtractor(allow=("", ),),
                    callback="parse_items",  follow= True),
    )

Recommended Answer

Replace your rules with this:

rules = (
    Rule(
        LinkExtractor(allow=('course-finder',),
                      restrict_xpaths=('//div[@class="pagination"]',)),
        callback='parse_items',
        follow=True,
    ),
)
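
For context, here is a minimal sketch of how that rule could fit into the spider from the question. It reuses the old scrapy.contrib imports and the ExItem/parse_items names from the question; the body of parse_items is a placeholder, since the real fields depend on how ExItem is defined.

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors import LinkExtractor

from Example.items import ExItem

class ExampleSpider(CrawlSpider):
    name = "example"
    allowed_domains = ["example.ac.uk"]
    start_urls = (
        'http://www.example.ac.uk',
    )

    # Follow only links whose URL contains 'course-finder' and that sit inside
    # the pagination <div>; every matched page is passed to parse_items.
    rules = (
        Rule(
            LinkExtractor(allow=('course-finder',),
                          restrict_xpaths=('//div[@class="pagination"]',)),
            callback='parse_items',
            follow=True,
        ),
    )

    def parse_items(self, response):
        # Placeholder extraction logic; fill in the XPaths for the fields
        # actually defined on ExItem.
        item = ExItem()
        # item['name'] = response.xpath('//h1/text()').extract()
        yield item

Two details worth keeping in mind with CrawlSpider: the callback must not be named parse, because CrawlSpider uses parse internally to drive the rules, and restrict_xpaths confines link extraction to the pagination block, which is what lets the spider move from one results page to the next.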
