Click button on website then scrape web page
Question
I have a website on which I would like to click a button and then scrape the page using Python. The HTML for the button is:
<span id="exchange-testing" class="exchange-input nav-link" data-track="&lid=testing&lpos=site_settings" data-value="testing">Testing</span>
Is this possible? I am able to scrape all the data I need from the page, but I need to click the button first.
Any help would be greatly appreciated.
Recommended answer
Basically, you have two options:
High-level approach: automate a real browser using selenium. In other words, make the browser repeat all the user actions needed to reach the page with the desired data.
Low-level approach: when you click the button, investigate what happens under the hood. Open the "Network" tab of the browser developer tools and see which requests are made, then simulate them in your scraper. For this, consider tools like requests or mechanize for making requests, handling sessions, and submitting forms, and tools like BeautifulSoup or lxml.html for HTML parsing. The Scrapy web-scraping framework is also well worth a look.
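For the low-level option, a minimal sketch with requests and BeautifulSoup might look like the following. The real URL and parameters have to be read from the "Network" tab, so anything passed to fetch_page is a placeholder. The helper names fetch_page and extract_texts are hypothetical as well.

```python
import requests
from bs4 import BeautifulSoup


def fetch_page(session, url, params=None):
    """Send the request that the button click would trigger; return the HTML."""
    # Assumption: the click results in a plain GET request. Use session.post
    # instead if the Network tab shows a POST.
    response = session.get(url, params=params, timeout=10)
    response.raise_for_status()
    return response.text


def extract_texts(html, css_selector):
    """Return the stripped text of every element matching css_selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [element.get_text(strip=True) for element in soup.select(css_selector)]


# Parsing demonstrated on a shortened version of the button markup from the question.
sample = '<span id="exchange-testing" class="exchange-input nav-link">Testing</span>'
print(extract_texts(sample, "span.exchange-input"))  # ['Testing']
```

In a real scraper, fetch_page would be called inside a requests.Session so that cookies set by earlier requests carry over, and its result would be passed to extract_texts with a selector for the data you need.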