PyQt Class not working for the second usage


Question

I'm using PyQt to fully load a page (including its JavaScript) and then get its contents using Beautiful Soup. It works fine on the first iteration, but crashes on the second. I don't know much Python, and even less PyQt, so any help is very welcome.

Class borrowed from here: http://stackoverflow.com/questions/14776989/javascript-interpreter-only-being-executed-on-the-first-page

from PyQt4.QtCore import QUrl, SIGNAL
from PyQt4.QtGui import QApplication
from PyQt4.QtWebKit import QWebPage

from bs4 import BeautifulSoup
from bs4.dammit import UnicodeDammit
import sys
import signal


class Render(QWebPage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.html = None
        signal.signal(signal.SIGINT, signal.SIG_DFL)
        self.connect(self, SIGNAL('loadFinished(bool)'), self._finished_loading)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def _finished_loading(self, result):
        self.html = self.mainFrame().toHtml()
        self.soup = BeautifulSoup(UnicodeDammit(self.html).unicode_markup)
        self.app.quit() 

###################################################################


l = ["http://www.google.com/?q=a", "http://www.google.com/?q=b", "http://www.google.com/?q=c"]

for page in l:
    soup = Render(page).soup
    print("# soup done: " + page)

Recommended answer

The example crashes because the Render class attempts to create a new QApplication and event-loop for every url it tries to load.

Instead, only one QApplication should be created, and the QWebPage subclass should load a new url after each page has been processed, rather than using a for-loop.

Here's a re-write of the example which should do what you want:

import sys, signal
from bs4 import BeautifulSoup
from bs4.dammit import UnicodeDammit
from PyQt4 import QtCore, QtGui, QtWebKit

class WebPage(QtWebKit.QWebPage):
    def __init__(self):
        QtWebKit.QWebPage.__init__(self)
        self.mainFrame().loadFinished.connect(self.handleLoadFinished)

    def process(self, items):
        self._items = iter(items)
        self.fetchNext()

    def fetchNext(self):
        try:
            self._url, self._func = next(self._items)
            self.mainFrame().load(QtCore.QUrl(self._url))
        except StopIteration:
            return False
        return True

    def handleLoadFinished(self):
        self._func(self._url, self.mainFrame().toHtml())
        if not self.fetchNext():
            print('# processing complete')
            QtGui.qApp.quit()


def funcA(url, html):
    print('# processing:', url)
    # soup = BeautifulSoup(UnicodeDammit(html).unicode_markup)
    # do stuff with soup...

def funcB(url, html):
    print('# processing:', url)
    # soup = BeautifulSoup(UnicodeDammit(html).unicode_markup)
    # do stuff with soup...

if __name__ == '__main__':

    items = [
        ('http://stackoverflow.com', funcA),
        ('http://google.com', funcB),
        ]

    signal.signal(signal.SIGINT, signal.SIG_DFL)
    print('Press Ctrl+C to quit\n')
    app = QtGui.QApplication(sys.argv)
    webpage = WebPage()
    webpage.process(items)
    sys.exit(app.exec_())
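
To see the control flow of the re-write in isolation, here is a minimal sketch that runs without Qt. It is not PyQt code: `TinyLoop`, `SequentialFetcher` and `fetch_all` are hypothetical names made up for illustration, and the "page load" is faked by posting a callback onto a queue. It only shows the idea of one loop for the whole program, with each "load finished" callback starting the next load instead of a for-loop creating a new loop per URL.

```python
from collections import deque


class TinyLoop:
    """Hypothetical stand-in for the single event loop (not a Qt class)."""

    def __init__(self):
        self._pending = deque()
        self._running = False

    def post(self, callback, *args):
        # Queue a callback to be run later by run().
        self._pending.append((callback, args))

    def quit(self):
        self._running = False

    def run(self):
        # Run queued callbacks until quit() is called or nothing is left.
        self._running = True
        while self._running and self._pending:
            callback, args = self._pending.popleft()
            callback(*args)


class SequentialFetcher:
    """Loads one URL at a time; each finished load triggers the next."""

    def __init__(self, loop, urls):
        self._loop = loop
        self._urls = iter(urls)
        self.results = []

    def start(self):
        self._fetch_next()

    def _fetch_next(self):
        url = next(self._urls, None)
        if url is None:
            # Nothing left to load: stop the one shared loop.
            self._loop.quit()
            return
        # Fake page load: the "finished" callback is delivered via the loop.
        self._loop.post(self._on_finished, url, "<html>%s</html>" % url)

    def _on_finished(self, url, html):
        self.results.append((url, html))
        self._fetch_next()


def fetch_all(urls):
    loop = TinyLoop()  # created once, reused for every URL
    fetcher = SequentialFetcher(loop, urls)
    fetcher.start()
    loop.run()
    return fetcher.results
```

In the PyQt version above, `QApplication` plays the role of `TinyLoop`, and `handleLoadFinished` plays the role of `_on_finished`.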
