Use "window.onload" in phantomjs


Question

I'm scraping data from AJAX-based pages using PhantomJS through the npm-phantom module. Sometimes the data isn't loaded yet when phantom starts the DOM traversal. How can I insert something like window.onload = function() { ... } into page.evaluate? When I try, it returns a function rather than the data.

var phantom = require('phantom');

exports.main = function (url, callback) {
    phantom.create(function (ph) {
        ph.createPage(function (page) {
            page.open(pref + url, function (status) {
                page.evaluate(function () {
                    // here  
                    var data = {};
                    data.one = document.getElementById("first").innerText;
                    data.two = document.getElementById("last").innerText;
                    return data;
                },
                function (res) {
                    callback(null, res);
                    ph.exit();
                });
            });
        });
    });
}

On the PhantomJS API page I found onLoadFinished, but how does it apply here?

Recommended answer

page.open(url, function(status){...}) is just another way of writing

page.onLoadFinished = function(status){...};
page.open(url);

The PhantomJS documentation puts it this way:

Also see WebPage#open for an alternate hook for the onLoadFinished callback.


Since this is an AJAX-based page, you need to wait for the data to appear. You can only do that by repeatedly checking a specific portion of the page.

You can find an example (waitfor.js) in the examples directory of the phantomjs installation. This will probably also work for phantomjs through npm-phantom.
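
The polling idea can be sketched roughly as follows. This is a minimal sketch modeled on the approach of the waitfor.js example, not a copy of it: the onTimeout and intervalMs parameters are hypothetical additions for illustration, and the usage part simulates late-arriving data with a timer so that it also runs outside PhantomJS.

```javascript
// Polls check() until it returns a truthy value or timeoutMs has elapsed.
// onTimeout and intervalMs are hypothetical parameters, not part of waitfor.js.
function waitFor(check, onReady, timeoutMs, onTimeout, intervalMs) {
    var start = new Date().getTime();
    var timer = setInterval(function () {
        if (check()) {
            // Condition met: stop polling and continue.
            clearInterval(timer);
            onReady();
        } else if (new Date().getTime() - start >= timeoutMs) {
            // Gave up: stop polling and report the timeout.
            clearInterval(timer);
            if (onTimeout) { onTimeout(); }
        }
    }, intervalMs || 250);
}

// Usage: a condition that becomes true after roughly 300 ms.
var ready = false;
setTimeout(function () { ready = true; }, 300);
waitFor(
    function () { return ready; },
    function () { console.log("ready"); },
    5000,
    function () { console.log("timed out"); },
    50
);
```

Clearing the interval in both branches keeps the script from polling forever once the outcome is decided.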

In your case it would look like this (abbreviated):

page.open(pref + url, function (status) {
    // waitFor is the polling helper from PhantomJS's examples/waitfor.js
    waitFor(function check() {
        return page.evaluate(function () {
            // ensure #first and #last are in the DOM
            return !!document.getElementById("first") &&
                   !!document.getElementById("last");
        });
    }, function onReady() {
        // keep the result of evaluate so it can be passed to the callback
        var res = page.evaluate(function () {
            var data = {};
            data.one = document.getElementById("first").innerText;
            data.two = document.getElementById("last").innerText;
            return data;
        });
        callback(null, res);
        ph.exit();
    }, 5000); // some timeout
});
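
The snippet above relies on page.evaluate returning its result directly. In the question's own code, the npm-phantom bridge delivers the result through a callback instead, so the check may need an asynchronous variant. The following is only a sketch under that assumption: waitForAsync is a hypothetical helper, and fakePage and fakeDom are stand-ins so the example runs without PhantomJS.

```javascript
// Hypothetical helper: polls an asynchronous check until it yields a truthy value.
// checkAsync receives a callback and must call it with the check result.
function waitForAsync(checkAsync, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = new Date().getTime();
    function poll() {
        checkAsync(function (ok) {
            if (ok) {
                onReady();
            } else if (new Date().getTime() - start >= timeoutMs) {
                onTimeout();
            } else {
                // Not ready yet: schedule the next check.
                setTimeout(poll, intervalMs || 250);
            }
        });
    }
    poll();
}

// Stand-in for a bridged page object (an assumption, not the real module):
// evaluate(fn, cb) runs fn and hands its result to cb asynchronously.
var fakeDom = {};
var fakePage = {
    evaluate: function (fn, cb) {
        setTimeout(function () { cb(fn()); }, 10);
    }
};
// Simulate AJAX data arriving after about 200 ms.
setTimeout(function () { fakeDom.first = "A"; fakeDom.last = "Z"; }, 200);

waitForAsync(
    function (done) {
        fakePage.evaluate(function () {
            return !!fakeDom.first && !!fakeDom.last;
        }, done);
    },
    function () {
        fakePage.evaluate(function () {
            return { one: fakeDom.first, two: fakeDom.last };
        }, function (res) {
            console.log(res.one, res.two);
        });
    },
    function () { console.log("timed out"); },
    5000,
    50
);
```

Scheduling the next poll only after the previous check has answered avoids overlapping checks when the bridge is slow.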
