Receiving multiple loadFinished signals for a requested web page


Problem description

      I'm receiving multiple loadFinished signals when I attempt to load a QWebPage, and I'm not sure what's causing the issue. A couple of other questions seem to describe the same problem, but their solutions didn't work for me:

        • QtWebPage - loadFinished() called multiple times
        • Signal QWebPage::loadFinished(bool) returns twice? (http://stackoverflow.com/questions/8415289/signal-qwebpageloadfinishedbool-returns-twice)

      In the first question, the answer was "connect signals to slots only once," but I already do that. The answer to the second question suggests that I should connect to the frame's loadFinished signal, but when I do that I simply don't get the necessary data.

      I attempt to load multiple pages:

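      #include <QApplication>
      #include <QList>
      #include <QUrl>
      #include <QWebFrame>
      #include <QWebPage>
      // Also needs the declaration of the UA class shown further down.

      int main(int argc, char *argv[])
      {
          QApplication app(argc, argv);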
      
          QList<QUrl> urls;
          urls.append(QUrl("http://www.useragentstring.com/pages/Chrome/"));
          urls.append(QUrl("http://www.useragentstring.com/pages/Firefox/"));
          urls.append(QUrl("http://www.useragentstring.com/pages/Opera/"));
          urls.append(QUrl("http://www.useragentstring.com/pages/Internet Explorer/"));
          urls.append(QUrl("http://www.useragentstring.com/pages/Safari/"));
      
          foreach(QUrl url, urls)
          {
              UA* ua = new UA();
              QWebPage* page = new QWebPage();
              //QObject::connect(page, SIGNAL(loadFinished(bool)), ua, SLOT(pageLoadFinished(bool)));
              QObject::connect(page->mainFrame(), SIGNAL(loadFinished(bool)), ua, SLOT(frameLoadFinished(bool)));
              // Load the page
              page->mainFrame()->load(url);
          }
      
          return app.exec();
      }
      

      The class that processes the signals looks like this:

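      #include <QDebug>
      #include <QObject>
      #include <QWebElement>
      #include <QWebFrame>
      #include <QWebPage>

      class UA : public QObject
      {
          Q_OBJECT
      private:
          int _numPageLoadSignals;
          int _numFrameLoadSignals;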
      public:
          UA()
          {
              _numPageLoadSignals = 0;
              _numFrameLoadSignals = 0;
          }
          ~UA(){}
      public slots:
          void pageLoadFinished(bool ok)
          {
              _numPageLoadSignals++;
      
              QWebPage * page = qobject_cast<QWebPage *>(sender());
              if(ok && page)
              {    
                  qDebug() << _numPageLoadSignals << " loads " 
                      << page->mainFrame()->documentElement().findAll("div#liste ul li a").count()
                      << " elements found on: " << page->mainFrame()->requestedUrl().toString();
              }
          }
      
          void frameLoadFinished(bool ok)
          {
              _numFrameLoadSignals++;
              QWebFrame * frame = qobject_cast<QWebFrame *>(sender());
              if(ok && frame)
              {
                  qDebug() << _numFrameLoadSignals << " loads " 
                      <<  frame->documentElement().findAll("div#liste ul li a").count()
                      << " elements found on: " << frame->requestedUrl().toString();
              }
          }
      };
      

      Here is the result of only connecting to the frame's loadFinished signal:

      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Safari/"
      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Opera/"
      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
      1  loads  241  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
      

      Here are the results when I connect to the page's loadFinished signal:

      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Safari/"
      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
      1  loads  0  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
      2  loads  576  elements found on:  "http://www.useragentstring.com/pages/Safari/"
      2  loads  782  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
      2  loads  241  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
      2  loads  1946  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
      3  loads  241  elements found on:  "http://www.useragentstring.com/pages/Internet Explorer/"
      3  loads  1946  elements found on:  "http://www.useragentstring.com/pages/Firefox/"
      3  loads  782  elements found on:  "http://www.useragentstring.com/pages/Chrome/"
      1  loads  964  elements found on:  "http://www.useragentstring.com/pages/Opera/"
      3  loads  576  elements found on:  "http://www.useragentstring.com/pages/Safari/"
      

      I don't understand this behavior: why do I sometimes get the relevant content and other times not? If I connect to the page's loadFinished signal, I eventually get the content, but I don't know when that will happen. How do I know when my page has actually finished loading?

      Update

      I'm assuming that most of my content will arrive in less than 3 seconds, so I've come up with a workaround: I set a timer event to signal the UA::loadFinished 3 seconds after the first loadFinished signal is received from the QWebPage. That's not very pretty, nor is it efficient, but it works for this situation.
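
The workaround above boils down to a deadline measured from the first loadFinished. Below is a minimal sketch of that logic, kept free of Qt so it compiles on its own. The class name FirstLoadDeadline and the caller-supplied millisecond clock are assumptions for illustration, not part of the original code. In a real application the delay would more likely come from QTimer::singleShot started in the slot.

```cpp
#include <cassert>

// Hypothetical, Qt-free model of the workaround: remember when the first
// loadFinished arrived and report "ready" once a fixed delay has elapsed.
class FirstLoadDeadline {
public:
    explicit FirstLoadDeadline(long long delayMs) : delayMs_(delayMs) {
        assert(delayMs >= 0);  // a negative delay makes no sense here
    }

    // Call from the loadFinished slot; only the first call starts the clock.
    void onLoadFinished(long long nowMs) {
        if (!started_) {
            started_ = true;
            firstSeenMs_ = nowMs;
        }
    }

    // True once delayMs have passed since the first loadFinished.
    bool isReady(long long nowMs) const {
        return started_ && (nowMs - firstSeenMs_ >= delayMs_);
    }

private:
    long long delayMs_;
    bool started_ = false;
    long long firstSeenMs_ = 0;
};
```

With a 3000 ms delay, later loadFinished signals do not push the deadline back, which matches the "3 seconds after the first signal" description. It inherits the same weakness: content that arrives after the deadline is missed.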

      Solution

      Quoting the QWebPage documentation:

      Finally, the loadFinished() signal is emitted when the page contents are loaded completely, independent of script execution or page rendering.

      The catch is that last phrase. I believe the people in the following thread point to the same problem:

      Why is QWebView.loadFinished called several times on some sites e.g. youtube?

      I have been struggling to write a crawler for pages that load content with JavaScript behind the scenes. Multiple loadFinished signals are a problem (I wish the signal fired only after everything had settled down), but I noticed that the essential problem is that the page content may still not be rendered or prepared even after the last loadFinished has activated a slot.

      So I experimented with many of the QWebPage signals to see whether any of them is consistently triggered after the loadFinished signal.

      Found one: repaintRequested(QRect)

      I don't know if this works all the time. But if any content affects the look of a web page, I believe this signal has to be emitted before the page can be considered complete. I am neither displaying the pages nor using a view widget, yet the signal is consistently triggered. The only problem is that it is triggered many times (much more often than loadFinished), so you need to check that mainFrame->requestedUrl() is the same as mainFrame->url() AND that a keyword from the content you are interested in exists. (This matters especially if you reuse the web page object as I do: a subsequent request changes requestedUrl while the mainFrame content from the previous load is still there.)
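
The check described above can be written as a small predicate. This is a hedged sketch that uses plain strings instead of Qt types. The function name contentLooksReady and its parameters are hypothetical: requestedUrl and currentUrl stand in for mainFrame->requestedUrl() and mainFrame->url(), and html stands in for the frame's current markup.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the two conditions from the answer:
// the frame shows the URL that was requested, and a keyword from the
// content of interest is present in the markup.
bool contentLooksReady(const std::string& requestedUrl,
                       const std::string& currentUrl,
                       const std::string& html,
                       const std::string& keyword) {
    assert(!keyword.empty());  // an empty keyword would match any page

    // A reused page can still hold the previous document, so the URLs must match.
    if (requestedUrl != currentUrl) {
        return false;
    }
    return html.find(keyword) != std::string::npos;
}
```

A plain substring search is the simplest stand-in for "a keyword exists". With the real classes, a selector query such as the findAll call used in the question would be a more precise test.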

      A trick to cut the number of signals to check might be to connect repaintRequested only after receiving a loadFinished signal from the QWebPage (and possibly after checking extra conditions).
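
One way to picture that trick is as a gate that ignores repaint notifications until a successful loadFinished has been seen. The sketch below models only that bookkeeping, without Qt. RepaintGate and its member names are assumptions for illustration. In real code the same effect could come from making the connect call inside the loadFinished slot, as the answer suggests.

```cpp
#include <cassert>

// Hypothetical, Qt-free model: repaint notifications are ignored until a
// successful loadFinished arrives, which reduces how many need inspecting.
class RepaintGate {
public:
    // Call from the loadFinished(bool) slot; failed loads do not open the gate.
    void onLoadFinished(bool ok) {
        if (ok) {
            armed_ = true;
        }
    }

    // Call when a new load is started on a reused page.
    void reset() {
        armed_ = false;
        inspected_ = 0;
    }

    // Call from the repaintRequested slot; returns true when the caller
    // should go on to inspect the page content.
    bool onRepaintRequested() {
        if (!armed_) {
            return false;
        }
        ++inspected_;
        return true;
    }

    // Number of repaint notifications let through since the last reset.
    int inspectedCount() const { return inspected_; }

private:
    bool armed_ = false;
    int inspected_ = 0;
};
```

Calling reset() before each new request matters when the page object is reused, for the same reason the URL comparison does.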

      This may not address infinitely nested loads, since one cannot know whether any given signal is the last one. But if you are searching for specific content, a signal is bound to be triggered after that content is loaded (I mean integrated into the DOM).
