Faster page processing with HtmlUnit


Question

So far I have working code that uses HtmlUnit to get a page with asXml().

However, I find that it processes everything on the page, including Shockwave Flash objects, which makes processing slow.

I only need it to process the plain HTML and JavaScript, so that it runs faster.

Here is my code:

        HtmlPage page = webClient.getPage(sb.toString());
        webClient.getJavaScriptEngine().pumpEventLoop(PUMP_TIME);
        pageString = page.asXml();

page.asXml() is quite slow, perhaps because of the points I stated above?

Is there a way to tell HtmlUnit not to process unnecessary parts of the page?

This is where I see page processing get stuck for quite some time (it happens many times):

[INFO] SEVERE: runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.] sourceName=[http://partner.googleadservices.com/gampad/google_ads_gpt.js] line=[9] lineSource=[null] lineOffset=[0]

Does HtmlUnit also load CSS and images into memory?

Recommended answer

HtmlUnit can't process Flash. It does take a lot of time to process JavaScript, though. The JavaScript is probably fetching something over the network, and that takes additional time. Note also that the log entry is actually an INFO and not a SEVERE; it is basically telling you that no Flash object is being created.

I would recommend avoiding JavaScript processing, if possible.
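Depending on the HtmlUnit version, JavaScript and CSS handling can be switched off on the client, for instance with webClient.getOptions().setJavaScriptEnabled(false) and setCssEnabled(false) in 2.x releases that have WebClientOptions. When the page really does need its own scripts, another option is to skip only the requests that add nothing, such as the ad script in the log above. The following is a minimal sketch of that decision in plain Java with no HtmlUnit dependency. The class name ResourceFilter, the method shouldSkip, and both block lists are hypothetical and only illustrate the idea; wiring it into HtmlUnit (for example from a custom web connection wrapper that returns an empty response for skipped URLs) is an assumption and is not shown here.

```java
import java.net.URI;
import java.util.List;
import java.util.Locale;

/** Decides whether a resource URL is worth fetching (illustrative sketch, not an HtmlUnit API). */
final class ResourceFilter {

    // Hypothetical block list: hosts whose scripts are not needed for the page content.
    private static final List<String> BLOCKED_HOSTS =
            List.of("partner.googleadservices.com");

    // Hypothetical list of file extensions to skip (Flash, stylesheets, images).
    private static final List<String> BLOCKED_EXTENSIONS =
            List.of(".swf", ".css", ".png", ".jpg", ".gif");

    static boolean shouldSkip(String url) {
        URI uri;
        try {
            uri = URI.create(url);
        } catch (IllegalArgumentException e) {
            return false; // unparseable URL: do not block, let the caller handle it normally
        }
        String host = uri.getHost() == null ? "" : uri.getHost().toLowerCase(Locale.ROOT);
        String path = uri.getPath() == null ? "" : uri.getPath().toLowerCase(Locale.ROOT);

        if (BLOCKED_HOSTS.contains(host)) {
            return true;
        }
        // The query string is not part of getPath(), so "x.swf?v=1" still matches.
        for (String ext : BLOCKED_EXTENSIONS) {
            if (path.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The ad script from the log in the question is skipped.
        System.out.println(shouldSkip("http://partner.googleadservices.com/gampad/google_ads_gpt.js")); // true
        // An ordinary script on another host is kept.
        System.out.println(shouldSkip("http://example.com/app.js")); // false
    }
}
```

Matching on exact host names keeps the filter predictable; a real block list would probably need suffix matching for subdomains and should be tuned against the pages actually being fetched.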
