PhantomJS unexpected load behavior with multiple pages

Question

I have a script (below) that scrapes a site in a three-step process. It works great when limited to one page at a time. However, when I raise the limit to two pages at a time, things start getting wonky: onLoadFinished fires earlier than I would expect, before the page is completely loaded, and because of this the rest of my script breaks. Any idea why this might be happening? I should add that I'm using the newest version (1.5).

MAX_PAGES = 1
###
Changing MAX_PAGES to >1 causes some pages' onLoadFinished event to fire before
the page is fully rendered. This is evident from the fact that there is more than
one image for some pages. I haven't been able to reproduce it using microsoft.com,
but on some pages I was working on, the first onLoadFinished seemed to be called
before the page was actually fully loaded, judging by the rendered images.
###

newPage = (id) ->
    # Per-page state: an id for file names and a counter of loads started.
    context = {}
    context.id = id
    context.step = 0
    context.page = require('webpage').create()
    context.page.onLoadStarted = ->
        context.step++
    context.page.onLoadFinished = (status) ->
        console.log status
        if status is 'success'
            # One screenshot per completed load, named <id>_<step>.png
            context.page.render("#{context.id}_#{context.step}.png")
        else
            context.page.release()
    # Start loading only after both handlers are attached.
    context.page.open('http://www.microsoft.com')
    console.log 'started loading'

newPage id for id in [1..MAX_PAGES]


Recommended answer

I think the problem is that every webpage within PhantomJS uses the same QNetworkAccessManager, so the finished() signal fires each time any of the webpage objects finishes loading. PhantomJS's own code might need to be modified to fix this. I have noticed this before when trying to load multiple pages in parallel in PhantomJS. An application I'm working on uses QtWebKit and loads multiple pages simultaneously, so I have to make sure that each webpage gets its own QNetworkAccessManager so that the finished() signals don't interfere with each other.
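If patching PhantomJS is not an option, one possible workaround (my assumption, not something the answer proposes) is to keep only one page loading at a time, which matches the MAX_PAGES = 1 case the question reports as working. Below is a minimal sketch in plain JavaScript, which PhantomJS also runs. The names loadSequentially, createPage, and onDone are hypothetical; only open, render, and release are taken from the webpage API used in the question.

```javascript
// Sketch: load URLs strictly one after another, so that at most one
// page is loading at any moment.
// `createPage` is a hypothetical injection point. Under PhantomJS it
// would be: function () { return require('webpage').create(); }
function loadSequentially(urls, createPage, onDone) {
    var results = [];
    var index = 0;

    function next() {
        if (index >= urls.length) {
            onDone(results);
            return;
        }
        var url = urls[index];
        var id = index + 1;
        index += 1;

        var page = createPage();
        var finished = false;   // guard in case the callback fires more than once

        page.open(url, function (status) {
            if (finished) {
                return;
            }
            finished = true;
            if (status === 'success') {
                page.render(id + '.png');
            }
            results.push({ id: id, url: url, status: status });
            page.release();     // as in the question; later versions offer close()
            next();             // start the next page only now
        });
    }

    next();
}

// Possible usage inside a PhantomJS script (assumption, not run here):
// loadSequentially(['http://www.microsoft.com'],
//     function () { return require('webpage').create(); },
//     function (results) { console.log(JSON.stringify(results)); phantom.exit(); });
```

This gives up parallelism in exchange for predictable load events, so it only helps when throughput is not the main concern.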
