How to scrape data after clicking a button

Question

I am trying to scrape data from a website with Beautiful Soup, but to scrape all of the content I have to click the button

<button class="show-more">view all 102 items</button>

to load every item. I have heard that this can be done with Selenium, but that means I would have to open a browser from a script and then scrape the data. Are there any other ways to solve this problem?

Recommended answer

You can call the same API endpoint that the page itself uses; it returns all of the information as JSON. Set the record count in the request higher than the total number of items you expect, so that a single request returns everything. The code below parses the album titles and URLs out of the JSON; you can explore the response to see what else it contains. You can find this endpoint in the network tab of your browser's developer tools when you refresh the URL you supplied.

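import requests

# Payload the page itself sends; "count" is set above the expected total
# so that one request returns every item.
data = {"fan_id":1812622,"older_than_token":"1557167238:2897209009:a::","count":1000}
# POST to the endpoint seen in the browser's network tab and decode the JSON reply.
r = requests.post('https://bandcamp.com/api/fancollection/1/wishlist_items', json = data).json()
# Collect (album title, URL) pairs from the returned items.
details = [(item['album_title'], item['item_url']) for item in r['items']]
print(details)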
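
The "count" value in the request above only needs to exceed the number of items, and that number already appears in the button text from the question ("view all 102 items"). As a rough sketch, the total could be read from the page HTML and used to build the payload instead of hard-coding 1000. The helper names `expected_item_count` and `build_payload` and the margin of 10 are illustrative assumptions, not part of the original answer, and the regular expression assumes the button markup looks exactly like the snippet in the question.

```python
import re


def expected_item_count(html: str, default: int = 1000) -> int:
    """Return N from a 'view all N items' show-more button, or `default` if absent."""
    # Assumes markup like: <button class="show-more">view all 102 items</button>
    match = re.search(r'class="show-more"[^>]*>\s*view all (\d+) items', html)
    return int(match.group(1)) if match else default


def build_payload(fan_id: int, older_than_token: str, total: int) -> dict:
    """Build the JSON payload, asking for slightly more records than expected."""
    # The margin of 10 is an arbitrary safety buffer chosen for this sketch.
    return {
        "fan_id": fan_id,
        "older_than_token": older_than_token,
        "count": total + 10,
    }


button_html = '<button class="show-more">view all 102 items</button>'
total = expected_item_count(button_html)
payload = build_payload(1812622, "1557167238:2897209009:a::", total)
print(payload["count"])  # 112
```

The resulting `payload` can be passed as the `json` argument of the `requests.post` call shown above. If the button is missing from the page, the helper falls back to the default of 1000 used in the answer.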
