JavaScript-based dynamic content using HtmlUnit
Problem description
I am stuck trying to get JavaScript-based dynamic content using HtmlUnit. I expect to get the sign-in and registration HTML content from the page, but with the following code I only get the static content.
I am new to HtmlUnit. Any help will be highly appreciated.
String strURL = "https://www.checkmytrip.com" ;
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);
final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getCookieManager().setCookiesEnabled(true);
webClient.waitForBackgroundJavaScript(60 * 1000);
webClient.setAjaxController(new NicelyResynchronizingAjaxController());
HtmlPage myPage = ((HtmlPage) webClient.getPage(strURL));
String theContent = myPage.getWebResponse().getContentAsString();
System.out.println(theContent);
Recommended answer
Two points:
- You need to call waitForBackgroundJavaScript() after you get the page. Before getPage() is called, no page scripts have started, so there is nothing to wait for.
- You should use myPage.asText() or .asXml() instead, because getWebResponse() returns the original content without JavaScript execution.
String strURL = "https://www.checkmytrip.com";
// Silence HtmlUnit and Apache HttpClient logging.
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(java.util.logging.Level.OFF);
java.util.logging.Logger.getLogger("org.apache.http").setLevel(java.util.logging.Level.OFF);
try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX_31)) {
    webClient.setAjaxController(new NicelyResynchronizingAjaxController());
    HtmlPage myPage = ((HtmlPage) webClient.getPage(strURL));
    // Wait up to 10 seconds for background JavaScript started by the page to finish.
    webClient.waitForBackgroundJavaScript(10 * 1000);
    // asXml() serializes the current DOM, including changes made by JavaScript.
    String theContent = myPage.asXml();
    System.out.println(theContent);
}
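A fixed 10-second wait may be longer than needed on a fast page and too short on a slow one. One common alternative is to poll until the expected content shows up, with an upper time limit. The sketch below shows only that polling pattern with the standard library, so it runs without HtmlUnit; the class name WaitUntil and the method waitUntil are hypothetical names made up for this illustration, not part of HtmlUnit. With HtmlUnit, the condition could plausibly be something like () -> myPage.asXml().contains("Sign in") (the marker text is an assumption about the page), and the sleep could be replaced by webClient.waitForBackgroundJavaScript(pollMillis).

```java
import java.util.function.BooleanSupplier;

public class WaitUntil {

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     * (Hypothetical helper for illustration; not an HtmlUnit API.)
     */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "the dynamic content has been rendered":
        // this condition becomes true roughly 200 ms after start.
        long start = System.currentTimeMillis();
        boolean ready = waitUntil(() -> System.currentTimeMillis() - start >= 200, 2_000, 50);
        System.out.println("content ready: " + ready);
    }
}
```

The condition is checked before each sleep, so content that is already present returns immediately, and the timeout keeps a page that never renders the expected content from blocking forever.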