Trying to parse HTML hidden by JavaScript


Question

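I've created a simple Java program that uses Jsoup to parse a page of data. The site's creators have since changed the page: if there is more than a certain amount of data on it, the page gives you the option to refine your search, or you can click a link and the data will come up. I've been tearing my hair out trying to find a solution. The URL doesn't change when the link is clicked, and the href for the link is just javascript:void(0);. Is there any way I can get at the HTML containing the data using just my script?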

Answer

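Try something that drives a real web browser, such as Selenium. It's the only one I have used, and I have never needed anything else. There may be other tools that suit you better, so it could be worth testing a few. Once Selenium (or whichever web driver you choose) has loaded the JavaScript-generated elements, parse them into Jsoup Elements. That way you don't have to change libraries completely; you just add one.

Also, in some cases you can work around the JavaScript by watching what changes in the browser's address bar.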
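The Selenium-then-Jsoup approach from the answer might look roughly like the sketch below. It assumes Selenium 4 and Jsoup are on the classpath and that Chrome is installed. The class name, the method names, and the URL and CSS selectors passed on the command line are all hypothetical placeholders, since the original question does not name the site or its markup.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class RenderedPageScraper {

    // Opens the page in a real browser, clicks the link that reveals the data,
    // waits for the data to appear, and returns the rendered HTML.
    // linkCss and dataCss are assumed selectors; adjust them to the actual page.
    static String fetchRenderedHtml(String url, String linkCss, String dataCss) {
        WebDriver driver = new ChromeDriver();
        try {
            driver.get(url);
            driver.findElement(By.cssSelector(linkCss)).click();
            // Wait up to 10 seconds for the JavaScript-inserted elements.
            new WebDriverWait(driver, Duration.ofSeconds(10))
                    .until(ExpectedConditions.presenceOfElementLocated(By.cssSelector(dataCss)));
            return driver.getPageSource();
        } finally {
            driver.quit(); // always close the browser, even on failure
        }
    }

    // Hands the rendered HTML to Jsoup, so existing Jsoup parsing code keeps working.
    static List<String> extractTexts(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element el : doc.select(cssQuery)) {
            texts.add(el.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        if (args.length < 3) {
            System.out.println("usage: RenderedPageScraper <url> <linkCss> <dataCss>");
            return;
        }
        String html = fetchRenderedHtml(args[0], args[1], args[2]);
        for (String text : extractTexts(html, args[2])) {
            System.out.println(text);
        }
    }
}
```

Only the fetching step changes here: Selenium replaces the HTTP request, and everything after Jsoup.parse can stay as it was in the original script.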


