How to use CrawlSpider from Scrapy to click a link with a JavaScript onclick?


Question

I want Scrapy to crawl pages where the link to the next page looks like this:

<a href="#" onclick="return gotoPage('2');"> Next </a>

Will Scrapy be able to interpret JavaScript code like that?

Using the LiveHTTPHeaders extension I found out that clicking Next generates a POST with a really huge piece of "garbage" in the body, starting like this:

encoded_session_hidden_map=H4sIAAAAAAAAALWZXWwj1RXHJ9n

I am trying to build my spider on the CrawlSpider class, but I can't really figure out how to code it. With BaseSpider I used the parse() method to process the first URL, which happens to be a login form, where I did a POST with:

from scrapy.http import FormRequest

def logon(self, response):
    # POST the sign-in form (the first form on the page) and hand the response to submit_next()
    login_form_data = {'email': 'user@example.com', 'password': 'mypass22', 'action': 'sign-in'}
    return [FormRequest.from_response(response, formnumber=0, formdata=login_form_data, callback=self.submit_next)]

And then I defined submit_next() to tell it what to do next. I can't figure out how to tell CrawlSpider which method to use on the first URL.

All requests in my crawling, except the first one, are POST requests. They alternate between two types of requests: pasting some data, and clicking "Next" to go to the next page.

Answer

The actual approach would be as follows (a minimal sketch follows the steps):

  1. Post your request to reach the landing page (as you are already doing)

  2. Extract the link to the next page from that response

  3. If possible, request the next page with a simple Request, or use FormRequest again where applicable
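
Below is a minimal sketch of those three steps, on the assumption that the "Next" link belongs to a <form> carrying the hidden encoded_session_hidden_map field; the spider name, start URL and extraction logic are placeholders, not taken from the question:

import scrapy
from scrapy.http import FormRequest

class PagedSpider(scrapy.Spider):
    name = 'paged'                             # hypothetical name
    start_urls = ['http://example.com/login']  # hypothetical URL

    # Step 1: POST the sign-in form to reach the first page of results
    def parse(self, response):
        return FormRequest.from_response(
            response, formnumber=0,
            formdata={'email': 'user@example.com', 'password': 'mypass22', 'action': 'sign-in'},
            callback=self.parse_page)

    # Steps 2 and 3: scrape the page, then re-submit the pagination form;
    # from_response() copies hidden fields such as encoded_session_hidden_map
    # into the POST body automatically
    def parse_page(self, response):
        # ... extract items from the current page here ...
        if response.xpath('//a[contains(@onclick, "gotoPage")]'):
            yield FormRequest.from_response(
                response, formnumber=0,
                # the field that gotoPage() sets may need to be added to
                # formdata by hand; see the notes below on dont_click
                dont_click=True,
                callback=self.parse_page)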

All of this has to be streamlined with the server's response mechanism, e.g.:


  • You can try using dont_click=True in FormRequest.from_response

  • Or you may want to handle the redirection (302) coming from the server, in which case you will have to indicate in the request meta that you want the redirected response delivered to your callback as well (both options are sketched below)
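
As a sketch, the two options could look roughly like this inside a spider callback (the callback names are placeholders; dont_click, dont_redirect and handle_httpstatus_list are standard Scrapy options):

from scrapy.http import FormRequest

# Option 1: submit the form values as-is, without simulating a click
# on any submit button
request = FormRequest.from_response(response, formnumber=0,
                                    dont_click=True,
                                    callback=self.parse_page)

# Option 2: stop the 302 from being followed automatically and have the
# redirect response delivered to the callback instead
request = FormRequest.from_response(
    response, formnumber=0,
    meta={'dont_redirect': True, 'handle_httpstatus_list': [302]},
    callback=self.parse_redirect)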

Now, how to figure it all out: use a web debugger like Fiddler, the Firefox plugin FireBug, or simply hit F12 in IE 9, and check that the requests a user actually makes on the website match the way you are crawling the webpage.
