Click a button on a website, then scrape the web page

Question

I have a website on which I would like to click a button and then scrape the page using Python. The HTML for the button is:

 <span id="exchange-testing" class="exchange-input nav-link" data-track="&amp;lid=testing&amp;lpos=site_settings" data-value="testing">Testing</span>

Is this possible? I am able to scrape all the data I need from the page, but I need to click the button first.

Any help would be greatly appreciated.

Recommended answer

Basically, you have two options:

  • High-level approach: automate a real browser using Selenium; in other words, make the browser repeat all the user actions needed to reach the page with the desired data.

  • Low-level approach: when you click the button, investigate what happens under the hood. Open the "Network" tab of the browser developer tools and see what requests are being made, then simulate them in your scraper. For making requests, handling sessions, and submitting forms, consider tools like requests and mechanize; for HTML parsing, consider BeautifulSoup and lxml.html. The Scrapy web-scraping framework is also well worth a look.
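
The two options might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes Selenium 4 with a local Chrome install, plus the requests and BeautifulSoup packages. The function names, the `url` argument, and the `params` passed to the low-level request are hypothetical; the real request has to be read off the Network tab for the site in question. Since the markup of the data to scrape is not given, the parsing step is shown on the button span from the question.

```python
BUTTON_ID = "exchange-testing"  # id of the <span> shown in the question


def fetch_html_after_click(url, timeout=10):
    """High-level approach: drive a real browser, click the button, return the HTML."""
    # Imported here so the other helpers work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome is installed locally
    try:
        driver.get(url)
        # Wait until the button is present and clickable, then click it.
        button = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.ID, BUTTON_ID))
        )
        button.click()
        # A real page may need another explicit wait here for new content to load.
        return driver.page_source
    finally:
        driver.quit()


def fetch_html_directly(url, params):
    """Low-level approach: replay the request observed in the Network tab."""
    import requests

    # `params` is hypothetical; copy the real query or form data from the browser.
    with requests.Session() as session:
        response = session.get(url, params=params, timeout=10)
        response.raise_for_status()
        return response.text


def button_value(html):
    """Parse HTML and return the button's data-value attribute, or None if absent."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    button = soup.find(id=BUTTON_ID)
    return button.get("data-value") if button else None
```

Either fetch function returns an HTML string that can be handed to BeautifulSoup, for example `button_value(fetch_html_after_click(url))` with `url` set to the page being scraped. The browser route is slower but needs no knowledge of the site's requests; the direct route is faster but breaks if the site changes its endpoints.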
