Using Python Requests to simulate clicking a 'show more' button
I am not sure what code to use to click the "show more" button. I want to get a list of universities that are working on a certain topic. Below is one of the websites:
http://www.sciencedirect.com/science/article/
Your help would be truly appreciated.
Thanks
You shouldn't have to simulate, in Python, an actual "click" of the "show more" button to accomplish web-scraping.
"Show more" buttons on websites are usually tied to some JavaScript that either reveals a hidden element already in the HTML (see Bootstrap's collapse class for a typical example) or fires off a request to some web service (e.g. a REST API) for information to insert into the DOM.
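To illustrate the first case, here is a minimal sketch using only the standard library. The HTML fragment, the `collapse` class usage, and the names `CollapsedTextParser` and `hidden_items` are all made up for illustration; a real page will have its own markup, so treat this as a pattern rather than something that matches any particular site.

```python
from html.parser import HTMLParser

# Hypothetical page fragment: the "more" content is already in the HTML,
# it is just collapsed until JavaScript reveals it.
SAMPLE_HTML = """
<button data-toggle="collapse" data-target="#more">Show more</button>
<div id="more" class="collapse">
  <ul>
    <li>University A</li>
    <li>University B</li>
  </ul>
</div>
"""


class CollapsedTextParser(HTMLParser):
    """Collect text found inside elements whose class list has 'collapse'.

    Sketch only: void tags without a closing slash (e.g. <br>) would
    throw off the depth counter and are not handled here.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0   # > 0 while we are inside a collapsed element
        self.items = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or "collapse" in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.items.append(text)


def hidden_items(html):
    """Return the text nodes found inside collapsed elements."""
    parser = CollapsedTextParser()
    parser.feed(html)
    return parser.items


print(hidden_items(SAMPLE_HTML))
```

In practice you would pass in the text of the downloaded page (for example `requests.get(url).text`) instead of `SAMPLE_HTML`, and a dedicated HTML library would likely be more robust than this hand-rolled parser.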
Either way, you can scrape that data. For the former, find the hidden element in the DOM (view the page's source [Ctrl + U] and search the HTML [Ctrl + F]), and use your typical web-scraping tools. For the latter, use something like the Network tab in Chrome DevTools to inspect the API request that is sent when you click "show more", and then try to replicate that request with Python.
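For the second case, replicating the request might look roughly like the sketch below. The endpoint `https://example.com/api/affiliations`, the `article` and `page` query parameters, and the header values are placeholders I invented; you would copy the real URL, parameters, and headers from the request shown in the Network tab. The sketch uses the standard library so it runs without extra packages; with the Requests library the equivalent call would be along the lines of `requests.get(url, params=params, headers=headers).json()`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint: replace with the URL seen in the Network tab.
API_URL = "https://example.com/api/affiliations"


def build_request(article_id, page):
    """Build (but do not send) a GET request mirroring the browser's call."""
    # Hypothetical query parameters; use whatever the real request carries.
    query = urlencode({"article": article_id, "page": page})
    headers = {
        "Accept": "application/json",
        # Some services check this header on JavaScript-initiated requests.
        "X-Requested-With": "XMLHttpRequest",
    }
    return Request(f"{API_URL}?{query}", headers=headers)


def fetch_json(request):
    """Send the request and decode the JSON body of the response."""
    with urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("abc123", page=2)
print(request.full_url)
```

Calling `fetch_json(request)` would actually hit the network, so it is left out of the example run. If the service paginates, you can loop over `page` until it returns an empty result.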
In the specific example you've given, it appears the data you want is stored in an HTML <script> tag as a JSON object. Search the HTML for the word "affiliation".
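As a rough sketch of pulling JSON out of a <script> tag: the sample HTML and the `authors` / `affiliation` key layout below are assumptions made for illustration, not the actual structure of the page in question, and `ScriptCollector` and `find_affiliations` are names I made up. You would need to look at the real JSON to see where the affiliations live.

```python
import json
from html.parser import HTMLParser

# Hypothetical page: the real JSON layout will differ.
SAMPLE_HTML = """
<html><body>
<script type="application/json">
{"authors": [
  {"name": "A. Author", "affiliation": "Example University"},
  {"name": "B. Author", "affiliation": "Sample Institute"}
]}
</script>
</body></html>
"""


class ScriptCollector(HTMLParser):
    """Collect the raw text content of every <script> element."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.scripts = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.scripts.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.scripts[-1] += data


def find_affiliations(html):
    """Return affiliations from any JSON <script> body that mentions them."""
    collector = ScriptCollector()
    collector.feed(html)
    affiliations = []
    for body in collector.scripts:
        if "affiliation" not in body:
            continue
        try:
            payload = json.loads(body)
        except json.JSONDecodeError:
            continue  # not pure JSON, e.g. "var data = {...};"
        if isinstance(payload, dict):
            for author in payload.get("authors", []):
                affiliations.append(author["affiliation"])
    return affiliations


print(find_affiliations(SAMPLE_HTML))
```

If the script body is JavaScript that assigns the object to a variable rather than pure JSON, you would first have to cut out the object literal (for instance by slicing between the first `{` and the last `}`) before handing it to `json.loads`.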