Click button on website then scrape web page
Question
I have a website on which I would like to click a button and then scrape the page using Python. The HTML code for the button is:
<span id="exchange-testing" class="exchange-input nav-link" data track="&lid=testing&lpos=site_settings" data-value="testing">Testing</span>
Is this possible? I am able to scrape all the data I need from the page, but I need to click the button first.
Any help would be appreciated.
Recommended answer
Options:
- High-level approach: automate a real browser using selenium or, in other words, make the browser repeat all the user actions needed to get to the page with the desired data.
- Low-level approach: when you click the button, investigate what is happening under the hood: explore the "Network" tab of the browser developer tools and see what requests are being made. Then, simulate them in your scraper. Here, you may consider using tools like requests or mechanize for making requests, handling scraping sessions, submitting forms, etc., and tools like BeautifulSoup or lxml.html for HTML parsing. Also, the Scrapy web-scraping framework is a must-see.
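As a rough illustration of the browser-automation route, here is a minimal sketch, not a definitive implementation. It assumes Selenium with a working Firefox driver is installed and that the button can be located by the id `exchange-testing` from the question. The function names `html_after_click` and `text_by_id` are made up for this example. To keep the sketch free of extra installs, the standard-library `html.parser` stands in for BeautifulSoup or lxml.html when pulling data out of the returned HTML.

```python
from html.parser import HTMLParser

BUTTON_ID = "exchange-testing"  # id of the <span> shown in the question

# Elements that never have a closing tag; skipped when tracking nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def html_after_click(url, button_id=BUTTON_ID, timeout=10):
    """Open url in a real browser, click the element, return the page HTML."""
    # Imported lazily so the parsing helper below stays usable on machines
    # where Selenium and a browser driver are not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumption: Firefox; webdriver.Chrome() is similar
    try:
        driver.get(url)
        button = WebDriverWait(driver, timeout).until(
            EC.element_to_be_clickable((By.ID, button_id))
        )
        button.click()
        # Assumption: the page has finished updating at this point. A real
        # scraper would wait for an element that only appears after the click.
        return driver.page_source
    finally:
        driver.quit()


class TextById(HTMLParser):
    """Collect the text inside the first element that has a given id."""

    def __init__(self, element_id):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # > 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif not self.done and dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_by_id(html, element_id):
    """Return the stripped text of the element with element_id, or ''."""
    parser = TextById(element_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()
```

A call might look like `html = html_after_click("https://example.com/")` followed by `text_by_id(html, "some-result-id")`, where both the URL and the result id are placeholders for whatever the real page uses. If the Network tab shows that the click only triggers a plain HTTP request, replaying that request directly is usually faster than driving a browser.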