PhantomJS Node - page.open - cannot keep track of multiple pages

Problem description

I'm using Phantom Node to interface Node with PhantomJS. I'm trying to open pages in parallel, but the page.open callback does not pass back a reference to the page, so I have no way of knowing which page has finished loading.

Relevant code

self.queue[j].page.open.call( self.queue[j].page, rows[i].url, function( status )
{
   console.log( this ) // <-- returns undefined
   // So how do I keep track of which pages have finished loading?
   // The only variable I have available here is `status`
});

Full function code:

SnapEngine.prototype.processSnaps = function( rows, type )
{
var self = this;

if ( ! rows || rows.length === 0 ) return true;

for( var i = 0; i < rows.length; i++ )
{
    // If queue is full, stop processing and wait for next snap engine iteration
    if ( self.getAvailableSizeInQueue() <= 0 )
    {
        self.logger.info( 'Queue is full for signature snap processing' );
        return true;
    }

    // Snapshots are processed by URL; if duplicate URLs are requested, all are updated after one of them completes
    // So if a URL is already being processed, don't reprocess it
    if ( self.findUrlInQueue( rows[i].url ) !== false )
    {
        self.logger.info( 'URL already being processed', rows[i].url );
        continue;
    }

    for( var j = 0; j < self.queue.length; j++ )
    {
        // Find an unused page object
        if ( self.queue[j] && self.queue[j].hasOwnProperty( 'page' ) && ( ! self.queue[j].page.url || self.queue[j].page.url == '' ))
        {
            self.logger.info( 'Opening URL in browser', rows[i].url );

            // Start loading page
            self.queue[j].page.open.call( self.queue[j].page, rows[i].url, function( status )
            {
                // ===== ISSUE HERE =====
                var url = this.url; // <-- this is undefined
                // ======================

                self.resetPage( self.queue[ index ]);

                if ( status === 'success' )
                {
                    self.updateStatus( url, 'ready' );
                }
                else
                {
                    self.updateStatus( url, 'failed' );
                }

                self.removeUrlFromQueue( url )
            });

            self.updateStatus( rows[i].url, 'processing' );
            break;
        }
    }
}
}

Recommended answer

Try the following:

I added an immediately executed function around the part that opens the page, which introduces a new scope. The url variable declared inside it therefore won't get mangled and remains available in your callback. (You cannot use rows[i].url directly in the callback, because i will have changed by the time the callback is called.)

for( var j = 0; j < self.queue.length; j++ )
{
    // Find an unused page object
    if ( self.queue[j] && self.queue[j].hasOwnProperty( 'page' ) && ( ! self.queue[j].page.url || self.queue[j].page.url == '' ))
    {
        self.logger.info( 'Opening URL in browser', rows[i].url );
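        (function() {
            // Capture this iteration's URL and queue slot in the new scope;
            // i and j will have changed by the time the callback fires.
            var url = rows[i].url;
            // Assumption: `index` (undefined in the original snippet) is meant to be the queue slot j
            var index = j;
            // Start loading page
            self.queue[index].page.open.call( self.queue[index].page, url, function( status )
            {
                self.resetPage( self.queue[ index ]);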

                if ( status === 'success' )
                {
                    self.updateStatus( url, 'ready' );
                }
                else
                {
                    self.updateStatus( url, 'failed' );
                }

                self.removeUrlFromQueue( url )
            });
        })();

        self.updateStatus( rows[i].url, 'processing' );
        break;
    }
}
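
To see the scoping effect in isolation, here is a minimal sketch that runs in plain Node without PhantomJS. `fakeOpen`, `pending`, and the example URLs are hypothetical stand-ins (not part of Phantom Node); `fakeOpen` only stores the callback so it can be fired after the loop has ended, which is roughly what happens with a real asynchronous page load.

```javascript
// Hypothetical stand-in for page.open: it stores the callback so it can be
// fired later, after the loop has finished (as a real async load would).
var pending = [];
function fakeOpen(url, callback) {
    pending.push(callback);
}

// Made-up input rows for illustration only.
var rows = [{ url: 'http://a.example' }, { url: 'http://b.example' }];
var finishedNaive = [];
var finishedScoped = [];

for (var i = 0; i < rows.length; i++) {
    // Naive version: the callback reads i when it runs, after the loop ended,
    // so rows[i] no longer refers to this iteration's row.
    fakeOpen(rows[i].url, function (status) {
        finishedNaive.push(rows[i] ? rows[i].url : undefined);
    });

    // Scoped version: the function parameter keeps this iteration's URL.
    (function (url) {
        fakeOpen(url, function (status) {
            finishedScoped.push(url);
        });
    })(rows[i].url);
}

// Simulate the pages finishing once the loop is done.
pending.forEach(function (callback) { callback('success'); });

console.log(finishedNaive);   // both entries are undefined
console.log(finishedScoped);  // both URLs, in order
```

The naive callbacks all see the final value of `i`, while each scoped callback keeps its own `url`. The same idea applies to any other per-iteration value the callback needs, such as the queue slot.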
