Load a web page, execute its JavaScript and dump resulting HTML to a file

Question

I need to load a web page, execute its JavaScript (and all JS files included via script tags) and dump the resulting HTML to a file. This needs to be done on the server. I have tried node.js with zombie.js, but it seems too immature to work in the real world. More often than not it just throws a bogus exception, while a real browser (Firefox) has no issues with the page.

My node.js code is:

var zombie = require("zombie"),
    sys = require('sys');

// Load the page
var browser = new zombie.Browser({ debug: false });
browser.visit('http://www.dba.dk', function (error, browser, status) {
    if (error) { console.log('Error:' + error.message); }
    if (!error && browser.statusCode == 200) {
        sys.puts(browser.html);
    }
});

It exits with the exception "TypeError: Cannot call method 'toString' of null".

Jaxer is not really an option. I need to download a third-party page and execute it on my server. How would I do that with Jaxer?

Recommended answer

Perhaps that’s because you are using err.message whereas err is not defined? error, on the other hand, is defined.
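The naming point can be illustrated with a small sketch. The two helper functions below are hypothetical and are not part of the question's code or of zombie.js; they only show what happens when a callback whose parameter is named `error` refers to `err` instead.

```javascript
'use strict';

// Hypothetical handler: the parameter is called `error`,
// so `err` is never declared anywhere in this scope.
function reportWithWrongName(error) {
  try {
    return 'Error: ' + err.message; // throws, because `err` is not defined
  } catch (e) {
    // Return a description of what went wrong instead of crashing.
    return e.name + ': ' + e.message;
  }
}

// Same handler, using the name that is actually declared.
function reportWithRightName(error) {
  return 'Error: ' + error.message;
}

console.log(reportWithWrongName(new Error('page failed to load')));
console.log(reportWithRightName(new Error('page failed to load')));
```

The first call reports a ReferenceError about `err`, while the second prints the message of the error that was passed in.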

Update

Have you looked at PhantomJS?
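As a rough idea of what the PhantomJS route could look like, here is a minimal sketch of a script that opens a page, lets its scripts run, and writes the resulting HTML to a file. The file name `dump.js`, the argument order, the `parseArgs` helper and the one-second settle delay are all assumptions made for illustration; a page that loads content asynchronously may need a longer delay or an explicit readiness check.

```javascript
// Intended to be run as: phantomjs dump.js <url> <output-file>
// Sketch only: dump.js, the argument order and SETTLE_MS are assumptions.
var SETTLE_MS = 1000; // time given to the page's scripts after load

// Hypothetical helper: under PhantomJS, args[0] is the script name.
function parseArgs(args) {
  if (args.length < 3) {
    return null;
  }
  return { url: args[1], out: args[2] };
}

// Only touch PhantomJS modules when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var fs = require('fs');
  var page = require('webpage').create();
  var opts = parseArgs(system.args);

  if (!opts) {
    console.log('Usage: phantomjs dump.js <url> <output-file>');
    phantom.exit(1);
  } else {
    page.open(opts.url, function (status) {
      if (status !== 'success') {
        console.log('Failed to load ' + opts.url);
        phantom.exit(1);
        return;
      }
      // Wait briefly so scripts on the page can modify the DOM,
      // then write the serialized document to the output file.
      setTimeout(function () {
        fs.write(opts.out, page.content, 'w');
        phantom.exit(0);
      }, SETTLE_MS);
    });
  }
}
```

With the question's URL this might be invoked as `phantomjs dump.js http://www.dba.dk page.html`, where `page.html` is just an example output name.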

Also, it looks like Aptana Jaxer could do what you want. To quote John Resig:


Imagine ripping off the visual rendering part of Firefox and replacing it with a hook to Apache instead - roughly speaking that's what Jaxer is.
