Get JavaScript-rendered HTML source using phantomjs


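Problem description

First of all, I am not looking for help with a development or testing environment. I am also new to phantomjs, and all I want is to run phantomjs from the command line in a Linux terminal.

I have an HTML page whose body is rendered by some JavaScript code. I want to download that rendered HTML content using phantomjs.

I have no experience with phantomjs, but I do have some experience with shell scripting, so I first tried to do this with curl. Because curl cannot execute JavaScript, I only got the HTML of the initial source; the rendered content was not downloaded. I heard that Ruby's Mechanize might do the job, but I don't know Ruby. On further investigation I found the command-line tool phantomjs. How can I do this with phantomjs?

Please feel free to ask for any additional information I need to provide.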

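Solution

Unfortunately, that is not possible using just the PhantomJS command line. You have to write a JavaScript file to actually accomplish anything with PhantomJS.

Here is a very simple version of a script you can use.

Code mostly copied from https://stackoverflow.com/a/12469284/4499924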



printSource.js

var system = require('system');
var page   = require('webpage').create();
// system.args[0] is the filename, so system.args[1] is the first real argument
var url    = system.args[1];
// render the page, and run the callback function
page.open(url, function () {
  // page.content is the source
  console.log(page.content);
  // need to call phantom.exit() to prevent from hanging
  phantom.exit();
});
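
The script above assumes that a URL was passed and that the page loaded. A slightly more defensive sketch might check both, and wait briefly before dumping the source so that client-side scripts have time to render. The file name `printSourceSafe.js`, the helper `parseArgs`, and the one-second default delay are assumptions made for illustration, not part of the original answer; a fixed delay is only a rough heuristic and may need tuning per page.

```javascript
// printSourceSafe.js (hypothetical filename): defensive variant of printSource.js.
// Written in ES5 because PhantomJS does not support newer syntax.

var DEFAULT_WAIT_MS = 1000; // assumed settle time for client-side rendering

// Pure helper: turn the raw argument list into { url, waitMs } or { error }.
// args[0] is the script name, args[1] the URL, args[2] an optional delay in ms.
function parseArgs(args) {
  if (!args || args.length < 2) {
    return { error: 'usage: phantomjs printSourceSafe.js <url> [waitMs]' };
  }
  var waitMs = DEFAULT_WAIT_MS;
  if (args.length > 2) {
    waitMs = parseInt(args[2], 10);
    if (isNaN(waitMs) || waitMs < 0) {
      return { error: 'waitMs must be a non-negative integer' };
    }
  }
  return { url: args[1], waitMs: waitMs };
}

// Only run the PhantomJS-specific part when the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var page = require('webpage').create();
  var opts = parseArgs(system.args);

  if (opts.error) {
    system.stderr.writeLine(opts.error);
    phantom.exit(1);
  } else {
    page.open(opts.url, function (status) {
      // status is 'success' or 'fail'
      if (status !== 'success') {
        system.stderr.writeLine('failed to load ' + opts.url);
        phantom.exit(1);
        return;
      }
      // Give scripts on the page time to finish rendering before dumping.
      setTimeout(function () {
        console.log(page.content);
        phantom.exit(0);
      }, opts.waitMs);
    });
  }
}
```

It would be invoked the same way as the original script, with an optional delay in milliseconds as a second argument, for example `phantomjs printSourceSafe.js http://todomvc.com/examples/emberjs/ 2000`.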

To print the page source to standard output:



phantomjs printSource.js http://todomvc.com/examples/emberjs/



To save the page source to a file:

phantomjs printSource.js http://todomvc.com/examples/emberjs/ > ember.html
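
Shell redirection is usually enough, but the file can also be written from inside the script using PhantomJS's `fs` module. The following is a minimal sketch under that assumption; the file name `saveSource.js`, the helper `fileNameFor`, and its naming scheme are hypothetical choices for illustration.

```javascript
// saveSource.js (hypothetical filename): write the rendered source to a file.
// ES5 only, since PhantomJS does not support newer syntax.

// Pure helper: derive a file name from a URL. The scheme is arbitrary:
// drop the protocol, query, and fragment, then replace unsafe characters.
function fileNameFor(url) {
  var name = url
    .replace(/^[a-z]+:\/\//i, '')       // drop the scheme
    .replace(/[?#].*$/, '')             // drop query string and fragment
    .replace(/\/+$/, '')                // drop trailing slashes
    .replace(/[^A-Za-z0-9.-]+/g, '_');  // collapse anything else to "_"
  return (name || 'page') + '.html';
}

// Only run the PhantomJS-specific part when the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var fs = require('fs');
  var page = require('webpage').create();
  var url = system.args[1];

  if (!url) {
    system.stderr.writeLine('usage: phantomjs saveSource.js <url> [outfile]');
    phantom.exit(1);
  } else {
    // An explicit output path wins; otherwise derive one from the URL.
    var out = system.args[2] || fileNameFor(url);
    page.open(url, function (status) {
      if (status !== 'success') {
        system.stderr.writeLine('failed to load ' + url);
        phantom.exit(1);
        return;
      }
      fs.write(out, page.content, 'w'); // 'w' overwrites an existing file
      phantom.exit(0);
    });
  }
}
```

With this sketch, `phantomjs saveSource.js http://todomvc.com/examples/emberjs/ ember.html` would write to `ember.html`, and omitting the second argument would fall back to the derived name.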


