JavaScript exception in HtmlUnit when clicking on a Google result page


Question

I want to use HtmlUnit (v2.21) to fetch some search result pages from Google. When searching for a person, this requires me to click the "people also looked for" link (on the right side, see the example link), which triggers some JavaScript and changes the content of the current page. But clicking it gives me a JavaScript WrappedException (see below).

Clickable example link: https://www.google.de/search?ie=UTF-8&safe=off&q=nicki+minaj

A simple test case that produces the error:

String url = "https://www.google.de/search?ie=UTF-8&safe=off&q=nicki+minaj";
WebClient client = new WebClient(BrowserVersion.BEST_SUPPORTED);
HtmlPage page = client.getPage(url);
HtmlElement link = page.getFirstByXPath("//a[@class='_Zjg']");
HtmlPage newPage = link.click(); //throws exception
this.storeResultFile(newPage.asXml(), "test");
client.close();

Result:

net.sourceforge.htmlunit.corejs.javascript.WrappedException: Wrapped java.lang.NullPointerException
at net.sourceforge.htmlunit.corejs.javascript.Context.throwAsScriptRuntimeEx(Context.java:2053)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.doProcessPostponedActions(JavaScriptEngine.java:947)
at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.processPostponedActions(JavaScriptEngine.java:1012)
at com.gargoylesoftware.htmlunit.html.DomElement.click(DomElement.java:799)
at com.gargoylesoftware.htmlunit.html.DomElement.click(DomElement.java:742)
at com.gargoylesoftware.htmlunit.html.DomElement.click(DomElement.java:689)

I stored the XML of the "page" object and made sure that the XPath expression is valid and returns results.

Does anyone have any ideas?

Recommended answer

It looks like the JavaScript engine (based on Rhino) is very easy to upset and quits on some script issues that other browsers can still run past. I don't know whether there is a mistake in Google's scripts, but these two lines solved it for me:

JavaScriptEngine engine = client.getJavaScriptEngine();
engine.holdPosponedActions();

Nevertheless, when running multiple HtmlUnit objects in multiple threads, it is still possible to come across this error. This is more of a workaround than a solution.
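Because the error can still show up occasionally under multi-threaded use, one possible mitigation is to retry the failing call a few times. This is not part of the original answer and has not been verified against Google's pages. The sketch below is plain Java with no HtmlUnit dependency; `Retry` and `withRetries` are hypothetical names introduced only for illustration. It assumes the failure surfaces as an unchecked exception, as the WrappedException in the stack trace above does.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of HtmlUnit.
public final class Retry {
    private Retry() {}

    /**
     * Runs the action and returns its result. If the action throws a
     * RuntimeException, it is run again, up to maxAttempts times in total.
     * If every attempt fails, the last exception is rethrown.
     */
    public static <T> T withRetries(int maxAttempts, Supplier<T> action) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }
}
```

In the test case from the question, the supplier would wrap the `link.click()` call. Since `click()` declares a checked `IOException`, the lambda would need to catch it and rethrow it as an unchecked exception. Retrying only hides an intermittent failure, so like the two lines above it is a workaround and not a fix.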
