Skip particular JavaScript execution in HtmlUnit

Question

I have a URL. I want to fetch the page source of that URL after its JavaScript has executed.

Related question: Fetching page source using HtmlUnit: URL gets stuck

Initially I suspected that the URL was getting stuck because of limited system resources and high CPU usage.

Then I tried running it on HtmlUnit 2.9 and 2.11. It got stuck on both while parsing. See the question linked above for the HtmlUnit scraping code that gets stuck.

Now I suspect that this might be caused by JavaScript execution going into an infinite loop.

I want to find out which JS files are causing the problem and exclude them from execution.

If they are scripts for services like Google Analytics or Twitter, I may not need them at all.

So I want to find a way to tell HtmlUnit to ignore certain JS files and execute the rest.

Does anyone know how to do this?

Recommended answer

Try this. It worked for me:

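import java.io.IOException;

import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.WebRequest;
import com.gargoylesoftware.htmlunit.WebResponse;
import com.gargoylesoftware.htmlunit.util.FalsifyingWebConnection;

// Wraps the client's web connection and answers requests for an unwanted
// script with an empty JavaScript response, so nothing is executed for it.
class InterceptWebConnection extends FalsifyingWebConnection {
    public InterceptWebConnection(WebClient webClient) throws IllegalArgumentException {
        super(webClient);
    }

    @Override
    public WebResponse getResponse(WebRequest request) throws IOException {
        // Check the URL before delegating, so the skipped script is never
        // downloaded and every other request is fetched only once.
        if (request.getUrl().toString().endsWith("dom-drag.js")) {
            return createWebResponse(request, "", "application/javascript", 200, "OK");
        }
        return super.getResponse(request);
    }
}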

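Then, when setting up the webClient, install the wrapper. The constructor registers the wrapper as the client's web connection, so the instance does not need to be stored: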

new InterceptWebConnection(webClient);
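
The answer skips a single file, while the question also mentions scripts from services such as Google Analytics and Twitter. One way to cover several scripts might be to move the URL check into a small helper, sketched below with only the standard library. The class name ScriptBlocklist, the method shouldSkip, and the host list are hypothetical and not part of HtmlUnit or the original answer; adjust the lists to the scripts that actually cause trouble.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: decides whether a script URL should be replaced
// with an empty response. Not part of HtmlUnit.
final class ScriptBlocklist {
    // Assumed example hosts; subdomains of these hosts are matched too.
    private static final List<String> BLOCKED_HOSTS =
            Arrays.asList("google-analytics.com", "twitter.com");
    // File name taken from the answer above.
    private static final List<String> BLOCKED_FILE_NAMES =
            Arrays.asList("dom-drag.js");

    private ScriptBlocklist() {
    }

    static boolean shouldSkip(String url) {
        if (url == null) {
            return false;
        }
        URI uri;
        try {
            uri = URI.create(url);
        } catch (IllegalArgumentException e) {
            // Unparseable URL: do not block it.
            return false;
        }
        String host = uri.getHost() == null ? "" : uri.getHost().toLowerCase(Locale.ROOT);
        for (String blocked : BLOCKED_HOSTS) {
            if (host.equals(blocked) || host.endsWith("." + blocked)) {
                return true;
            }
        }
        // Compare only the last path segment, so a query string such as
        // "?v=2" does not defeat the match.
        String path = uri.getPath() == null ? "" : uri.getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        return BLOCKED_FILE_NAMES.contains(fileName);
    }
}
```

Inside getResponse, the endsWith check could then become ScriptBlocklist.shouldSkip(request.getUrl().toString()). Matching on the host and the last path segment, rather than on the end of the whole URL string, is a design choice of this sketch.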
