How to call a JavaScript function using BeautifulSoup and Python


Problem description

I am scraping data from a website as part of my project. I can make the request and grab the data that is present in the DOM. However, some of the data is only rendered when a JavaScript onClick function runs.

One option is to use Selenium to click the link (which calls the JavaScript function) and grab the rendered data, but this process is time-consuming, and I don't want to open a browser.

Is there any way other than Selenium to achieve this?

Website: http://catalog.fullerton.edu/preview_entity.php?catoid=16&ent_oid=1849

In the courses section of this webpage, every course is a hyperlink, and clicking a course calls a JavaScript function. I need the data that is rendered after that JavaScript function call.

Solution

You can't. BeautifulSoup only parses HTML; it does not execute JavaScript. If you want to run JavaScript, you'll need to use a headless browser. Otherwise, you'll have to read through the JavaScript and see what it does.

Click on the element while your browser's developer tools are open in the Network tab:

You can now see that the JavaScript downloads new HTML from that URL. You can easily send the same request with urllib.
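As a rough illustration of sending the same request with urllib, here is a minimal sketch. The endpoint URL and the `item_id` parameter are placeholders (assumptions, not taken from the site); substitute whatever URL and query string the Network tab shows after you click a course. The helper names `build_request` and `fetch_fragment` are also hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder (assumption): replace with the URL shown in the Network tab
# after clicking the element.
AJAX_ENDPOINT = "http://example.com/ajax/endpoint.php"


def build_request(endpoint, params):
    """Build a GET request for the endpoint with the given query parameters."""
    url = f"{endpoint}?{urlencode(params)}"
    # Some servers reject urllib's default User-Agent, so send a browser-like one.
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def fetch_fragment(endpoint, params, timeout=10):
    """Send the request and return the response body decoded as text."""
    with urlopen(build_request(endpoint, params), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example call (hypothetical parameter names and values):
# html = fetch_fragment(AJAX_ENDPOINT, {"catoid": 16, "item_id": 123})
```

The string returned by `fetch_fragment` is ordinary HTML, so it can be passed to `BeautifulSoup(html, "html.parser")` and searched like any other page, with no browser involved.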
