How to get JavaScript output in Python with BeautifulSoup or any other module


Question

While trying to build a scraper, I found a website that relies heavily on JavaScript in its pages. Is it possible to retrieve the output of a script? For example:

<html>
<head>
<title>Python</title>
</head>
<body>
<script type="text/javascript" src='test.js'></script>
<p> some stuff <br>
more stuff <br>
code <br>
video <br>
picture <br>
movie <br>
. <br>
. <br>
. <br>
</p>
<span>Your Number is:  </span>
<script type="text/javascript">document.write(math(5, 10, 15));</script>
</body>
</html>

where "test.js" contains:

function math (a, b, c) {return a * b * c * c * a * b * c + a + b +c - a;}

When I use BeautifulSoup, it shows the script source itself, i.e.:

<script type="text/javascript">document.write(math(5, 10, 15));</script>

However, I need to get "Your Number is: 8437525". I can get the text inside the span with soup.span.get_text(), but I can't get the number that the script writes.

Recommended answer

BeautifulSoup cannot execute JavaScript; it only parses the markup it is given. I suggest integrating something like PhantomJS into your scraper. If you can drop Python, you can write the whole scraper in PhantomJS.
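To make the idea concrete, here is a minimal sketch of the same approach: let a real browser engine run the scripts, then read the rendered text. PhantomJS is no longer maintained, so this sketch swaps in Selenium driving headless Chrome, which the original answer does not mention. The third-party selenium package (version 4 or later), a local Chrome installation, and the helper names render_visible_text and number_after_label are all assumptions made for illustration.

```python
import re


def render_visible_text(url):
    """Load `url` in headless Chrome and return the visible text of <body>.

    Assumption: the third-party `selenium` package (4+) and Chrome are
    installed. Neither is part of the original answer.
    """
    # Imported lazily so number_after_label() works without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By

    options = Options()
    options.add_argument("--headless")  # run Chrome without a window
    driver = webdriver.Chrome(options=options)
    try:
        # The browser fetches test.js and runs the inline document.write call.
        driver.get(url)
        return driver.find_element(By.TAG_NAME, "body").text
    finally:
        driver.quit()  # always release the browser process


def number_after_label(text, label="Your Number is:"):
    """Return the integer (as a string) that follows `label`, or None."""
    match = re.search(re.escape(label) + r"\s*(-?\d+)", text)
    return match.group(1) if match else None
```

With the example page served at some address you control, number_after_label(render_visible_text(address)) should return the number the script wrote. If you would rather keep BeautifulSoup for the parsing step, one option is to return driver.page_source instead of the body text and pass that rendered HTML to BeautifulSoup.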
