TinyXML2: get text from a node and all subnodes

Question

How do I get the text from a node and all of its subnodes in TinyXML2?

The XMLPrinter class seems to do what I need, but it does not print the text properly.

My XML:

<div>The quick brown <b>fox</b> jumps over the <i>lazy</i> dog.</div>

My class, which extends the XMLPrinter class:

class XMLTextPrinter : public XMLPrinter {
    virtual bool    VisitEnter (const XMLDocument &) { return true; }
    virtual bool    VisitExit (const XMLDocument &)  { return true; }
    virtual bool    VisitEnter (const XMLElement &e, const XMLAttribute *)  {
        auto text = e.GetText();
        if(text) {
            std::cout << text;
        }
        return true;
    }
    virtual bool    VisitExit (const XMLElement &e)  { return true; }
    virtual bool    Visit (const XMLDeclaration &)  { return true; }
    virtual bool    Visit (const XMLText &e) { return true; }
    virtual bool    Visit (const XMLComment &)  { return true; }
    virtual bool    Visit (const XMLUnknown &)  { return true; }
};

My code:

XMLDocument document;
document.Parse(..., ...);

auto elem = ...;

XMLTextPrinter printer;
elem->Accept(&printer);

Output:

The quick brown foxlazy

Why is it ignoring all the text that comes after the <b> and <i> elements? How can I solve this? Also, the XMLPrinter class prints the text correctly together with the tags, but I do not want the tags.

Recommended answer

[Edited 14 April 2017, hopefully to improve it.]

XMLPrinter derives from XMLVisitor and prints the XML document (or element) in full: tags, attributes and all. XMLVisitor is the interface used to walk the XML hierarchy: when a node's Accept() is called with a visitor, VisitEnter/VisitExit are called for nodes that can have descendants (children), i.e. documents and elements, and Visit is called for leaf nodes, i.e. text, comments etc. XMLVisitor supplies default, do-nothing implementations of all of these methods. Override them in a derived class to implement the desired functionality.

The first problem is that you are deriving from XMLPrinter. XMLPrinter itself derives from XMLVisitor and creates a printable representation of the XML document, but you then replace all of XMLPrinter's Visit... methods with your own, so none of its functionality is used. It would be much better, and less work, to derive from XMLVisitor directly.

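Secondly, you are getting the element text in VisitEnter alone, using GetText(). GetText() returns only the text of the element's first child, and only if that child is a text node, so it does not work when child elements are embedded in the text; the TinyXML2 documentation for XMLElement::GetText() describes this. That is why " jumps over the " and " dog." are missing from your output: they are later text children of <div>, which GetText() never reaches.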

In this case, to get only the text of all the elements, override Visit for the text leaf nodes, i.e. Visit(const XMLText &).

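#include "tinyxml2.h"
#include <iostream>

using namespace tinyxml2;

// Prints the value of every text node it visits; tags are ignored.
class XMLPrintText : public XMLVisitor
{
public:
   bool Visit(const XMLText& txt) override
   {
      std::cout << txt.Value();
      return true;   // continue with the next sibling
   }
};

int main()
{
   XMLDocument doc;
   if (doc.Parse("<div>The quick brown <b>fox</b> jumps over the <i>lazy</i> dog.</div>") != XML_SUCCESS)
      return 1;   // parsing failed
   const XMLElement* div = doc.FirstChildElement();
   if (!div)
      return 1;   // no root element
   XMLPrintText prt;
   div->Accept(&prt);   // visits div and all of its descendants
   return 0;
}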
