How to use QtWebKit in Python threads?
Question
I'm trying to parse web pages generated by JavaScript with QtWebKit. I found an example of how to get the page source:
import sys
from PySide.QtGui import *
from PySide.QtCore import *
from PySide.QtWebKit import *

class Render(QWebPage):
    def __init__(self, url):
        self.app = QApplication(sys.argv)
        QWebPage.__init__(self)
        self.loadFinished.connect(self._loadFinished)
        self.mainFrame().load(QUrl(url))
        self.app.exec_()

    def _loadFinished(self, result):
        self.frame = self.mainFrame()
        self.app.quit()

url = 'http://www.thesite.gov/search'
r = Render(url)
html = r.frame.toHtml()
But I don't know how to make it work in threads. How can I do this? If it's not possible, is there another fast way to get web pages generated by JavaScript?
Recommended answer
Given Qt's asynchronous nature, the QtWebKit methods are non-blocking as well, so there is no point in running them in threads. You can start them in parallel like this:
from __future__ import print_function  # lets the same code run on Python 2 and 3

from functools import partial

from PySide.QtCore import QUrl
from PySide.QtGui import QApplication
from PySide.QtWebKit import QWebView, QWebSettings

TARGET_URLS = (
    'http://stackoverflow.com',
    'http://github.com',
    'http://bitbucket.org',
    'http://news.ycombinator.com',
    'http://slashdot.org',
    'http://www.reddit.com',
    'http://www.dzone.com',
    'http://www.ideone.com',
    'http://jsfiddle.net',
)


class Crawler(object):
    def __init__(self, app):
        self.app = app
        self.results = dict()   # final URL (string) -> page HTML
        self.browsers = dict()  # browser_id -> (web_view, finished flag)

    def _load_finished(self, browser_id, ok):
        print(ok, browser_id)
        web_view, _flag = self.browsers[browser_id]
        self.browsers[browser_id] = (web_view, True)

        # Store the rendered HTML under the URL the frame ended up on.
        frame = web_view.page().mainFrame()
        self.results[frame.url().toString()] = frame.toHtml()

        web_view.loadFinished.disconnect()
        web_view.stop()

        # Quit the event loop once every view has reported in.
        if all(finished for _view, finished in self.browsers.values()):
            print('all finished')
            self.app.quit()

    def start(self, urls):
        for browser_id, url in enumerate(urls):
            web_view = QWebView()
            web_view.settings().setAttribute(QWebSettings.AutoLoadImages,
                                             False)
            # partial() binds browser_id so the slot knows which view finished.
            loaded = partial(self._load_finished, browser_id)
            web_view.loadFinished.connect(loaded)
            web_view.load(QUrl(url))
            self.browsers[browser_id] = (web_view, False)


if __name__ == '__main__':
    app = QApplication([])
    crawler = Crawler(app)
    crawler.start(TARGET_URLS)
    app.exec_()
    print('got:', list(crawler.results.keys()))
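The "quit once every page has finished" bookkeeping in the crawler can be tried out without Qt installed. The following is only a minimal sketch of that idea: `LoadTracker`, `register` and `mark_finished` are hypothetical names that do not appear in the answer, and the loadFinished signals are simulated by calling the callbacks by hand.

```python
from functools import partial


class LoadTracker(object):
    """Hypothetical helper: remembers which page ids have finished loading."""

    def __init__(self, on_all_done):
        self.finished = {}              # page id -> True once its load finished
        self.on_all_done = on_all_done  # called when every page has reported

    def register(self, page_id):
        self.finished[page_id] = False
        # partial() binds page_id, the same way the crawler binds browser_id.
        return partial(self.mark_finished, page_id)

    def mark_finished(self, page_id, ok):
        # A failed load (ok is False) still counts as finished.
        self.finished[page_id] = True
        if all(self.finished.values()):
            self.on_all_done()


events = []
tracker = LoadTracker(on_all_done=lambda: events.append('all finished'))

# Register every page first, as the crawler does before app.exec_() runs.
callbacks = [tracker.register(page_id) for page_id in range(3)]

# Simulate the loadFinished signals arriving in arbitrary order.
callbacks[2](True)
callbacks[0](False)
callbacks[1](True)
print(events)  # ['all finished']
```

As in the crawler, all pages have to be registered before any callback fires. Otherwise the first page to finish would look like the last one and trigger the shutdown too early.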