Parsing HTML with C++ (using Qt preferably)

Views: 1913 Published: 2016/10/25 14:56:14 Tags: c++ html qt parsing qwebkit


Question

I'm trying to parse some HTML with C++ to extract all URLs from it (the URLs can be inside href and src attributes).

I tried to use WebKit to do the heavy lifting for me, but for some reason, when I load a frame with HTML, the generated document is all wrong. (If I let WebKit fetch the page from the web, the generated document is fine, but WebKit then also downloads all the images, styles, and scripts, which I don't want.)

This is what I tried to do:

frame->setHtml(HTML);
QWebElement document = frame->documentElement();
// findAll() returns a QWebElementCollection; each result needs its own variable name.
QWebElementCollection links = document.findAll("a"); // Doesn't find all links
QWebElementCollection imgs = document.findAll("img"); // Doesn't find all images
QWebElementCollection scripts = document.findAll("script"); // Doesn't find all scripts
qDebug() << document.toInnerXml(); // Prints a completely messed-up document with several missing elements

What am I doing wrong? Is there an easy way to parse HTML with Qt (or with some other lightweight library)?

Recommended answer

You can always use XPath expressions to make your parsing life easier; see, for instance, the QXmlQuery documentation on running XPath expressions: http://doc.trolltech.com/4.5/qxmlquery.html#running-xpath-expressions

Or you can do something like this:

QWebView* view = new QWebView(parent);
view->load(QUrl("http://www.your_site.com"));
// load() is asynchronous: run the query once the loadFinished() signal has fired.
QWebElementCollection elements = view->page()->mainFrame()->findAllElements("a");
