libxml2 xmlNodePtr's type changes when I observe it

Problem description

I spent some time yesterday debugging a lovely Heisenbug where an xmlNodePtr's type would change on me. This example shows the error:

#include <iostream>

#include <vector>
#include <string>
#include <memory>    // std::unique_ptr

#include <cstdint>

#include <libxml/tree.h>
#include <libxml/parser.h>

struct SomeDataType {
    std::vector<std::vector<std::string>> data;

    explicit SomeDataType(uint32_t rows_, uint32_t columns_)
        : data(rows_)
    {
        for (uint32_t row = 0; row < rows_; ++row) {
            data[row].resize(columns_);
        }
    }
};

static std::vector<xmlNodePtr> GetChildren(xmlNodePtr node)
{
    std::vector<xmlNodePtr> children;

    xmlNodePtr child = node->children;
    while (child) {
        if (child->type == XML_ELEMENT_NODE) {
            children.push_back(child);
        }
        child = child->next;
    }

    return children;
}

int main() {
    std::unique_ptr<xmlDoc, void(*)(xmlDoc*)> document = { xmlParseEntity("libxml2-fail.xml"), xmlFreeDoc };

    SomeDataType{ 3, 2 };

    xmlNodePtr root = xmlDocGetRootElement(document.get());

    for (const xmlNodePtr &child : GetChildren(root)) {
        const xmlNodePtr &entry = GetChildren(child)[0]; // Problem here...
        std::cout << "Expected " << XML_ELEMENT_NODE << " but was " << entry->type << std::endl;
        std::cout << entry->name << std::endl;
    }
}

Compiled with:

g++ -g -std=c++14 -Wall -Wextra -pedantic -I/usr/include/libxml2 libxml2-fail.cpp -lxml2 -o fail.out

The xml file:

<?xml version="1.0" encoding="utf-8"?>
<data>
  <tag>
    <subtag>1</subtag>
  </tag>
</data>

Running gives me the following output:

Expected 1 but was 17

Stepping through with gdb, everything is fine until we reach the line const xmlNodePtr &entry = .... After that line, the node entry refers to has type XML_ENTITY_DECL instead of XML_ELEMENT_NODE. However, if I run the following commands, the node behind the reference morphs into the type I expect:

48          const xmlNodePtr &entry = GetChildren(child)[0];
(gdb) n
49          std::cout << "Expected " << XML_ELEMENT_NODE << " but was " << entry->type << std::endl;
(gdb) p *entry
$1 = {_private = 0x0, type = XML_ENTITY_DECL, name = 0x0, children = 0xb7e67d7c <std::string::_Rep::_S_empty_rep_storage+12>, last = 0x0, parent = 0x69, next = 0x0, prev = 0x9, doc = 0x0, ns = 0x805edb8, content = 0x805edb8 "", properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 60648, extra = 2053}
(gdb) p *child
$2 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0x805ee98 "tag", children = 0x805eea8, last = 0x805ef98, parent = 0x805edb8, next = 0x805efe8, prev = 0x805ee08, doc = 0x805ece8, ns = 0x0, content = 0x0, properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 3, extra = 0}
(gdb) p GetChildren(child)
$3 = std::vector of length 1, capacity 1 = {0x805eef8}
(gdb) p *entry
$4 = {_private = 0x0, type = XML_ELEMENT_NODE, name = 0x805ef38 "subtag", children = 0x805ef48, last = 0x805ef48, parent = 0x805ee58, next = 0x805ef98, prev = 0x805eea8, doc = 0x805ece8, ns = 0x0, content = 0x0, properties = 0x0, nsDef = 0x0, psvi = 0x0, line = 4, extra = 0}
(gdb) 

I don't have the problem when I instead loop over the one element like so:

for (const xmlNodePtr &entry : GetChildren(child)) {
    ...
}

I also don't have the problem when I don't make the xmlNodePtr a const reference like so:

xmlNodePtr entry = GetChildren(child)[0];

However, according to this Stack Overflow question, binding to a const reference shouldn't be a problem.

The SomeDataType struct is strangely necessary; without it I get a segfault instead, because entry becomes a null pointer.

Where is this bug coming from?

Solution

When you do this:

const xmlNodePtr &entry = GetChildren(child)[0]; // Problem here...

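You're binding a reference to part of a temporary in a way that does not extend the temporary's lifetime. operator[] returns a reference, so you're not binding entry directly to a temporary; you're binding it to whatever that returned reference refers to, which is an element inside the temporary vector returned by GetChildren(). Lifetime extension only applies when a reference binds directly to a temporary, not to the result of a function call made on it. The temporary vector is destroyed at the end of the full expression (the semicolon) and frees its storage, leaving you with a dangling reference. Reading through it is undefined behavior, which is consistent with the result changing depending on what else has used the heap in the meantime, such as the SomeDataType temporary or the extra GetChildren(child) call made from gdb.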
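To make the order of events visible without libxml2, here is a minimal sketch. LoggedVector, MakeChildren and TraceIndexedTemporary are hypothetical names invented for this illustration; LoggedVector stands in for the std::vector<xmlNodePtr> returned by GetChildren() and records when it is created and destroyed. The sketch deliberately copies the element instead of binding a reference, so it never touches freed memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the vector returned by GetChildren():
// it logs its own construction and destruction.
class LoggedVector {
public:
    LoggedVector(std::vector<int> items, std::vector<std::string>& log)
        : items_(std::move(items)), log_(log) { log_.push_back("created"); }
    ~LoggedVector() { log_.push_back("destroyed"); }
    LoggedVector(const LoggedVector&) = delete;
    LoggedVector& operator=(const LoggedVector&) = delete;

    // Like std::vector::operator[], this returns a reference to an element.
    const int& operator[](std::size_t i) const { return items_[i]; }

private:
    std::vector<int> items_;
    std::vector<std::string>& log_;
};

// Returns a temporary, the way GetChildren() does.
LoggedVector MakeChildren(std::vector<std::string>& log) {
    return {std::vector<int>{10, 20, 30}, log};
}

std::vector<std::string> TraceIndexedTemporary() {
    std::vector<std::string> log;
    // The element is copied while the temporary is still alive. The
    // temporary is destroyed at the end of this full expression.
    const int first = MakeChildren(log)[0];
    log.push_back("next statement");
    assert(first == 10);
    return log;
}
```

The log returned by TraceIndexedTemporary() shows "destroyed" before "next statement": by the time the following statement runs, the vector and its storage are already gone, so a reference bound to [0] would have nothing valid to refer to. Building the original program with g++'s -fsanitize=address option would typically report the read through entry as a heap-use-after-free.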


However, when you instead tried:

for (const xmlNodePtr &entry : GetChildren(child)) {

that is syntactic sugar for:

{
    auto&& __range = GetChildren(child); // bind temporary to reference
                                         // lifetime IS extended
    auto b = begin(__range);
    auto e = end(__range);
    for (; b != e; ++b) {
        const xmlNodePtr& entry = *b;
        // ...
    }
}

Here, *b refers to an element of the vector bound to __range. Because __range binds directly to the temporary, the vector's lifetime is extended to match that of __range, which lasts through the entire body of the loop. No dangling reference.
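The same kind of logging can illustrate the range-based for case. This is again only a sketch under assumptions: LoggedVector, MakeChildren and TraceRangeFor are hypothetical names for this example, with ints standing in for xmlNodePtr values.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the vector returned by GetChildren():
// it logs its own construction and destruction.
class LoggedVector {
public:
    LoggedVector(std::vector<int> items, std::vector<std::string>& log)
        : items_(std::move(items)), log_(log) { log_.push_back("created"); }
    ~LoggedVector() { log_.push_back("destroyed"); }
    LoggedVector(const LoggedVector&) = delete;
    LoggedVector& operator=(const LoggedVector&) = delete;

    std::vector<int>::const_iterator begin() const { return items_.begin(); }
    std::vector<int>::const_iterator end() const { return items_.end(); }

private:
    std::vector<int> items_;
    std::vector<std::string>& log_;
};

// Returns a temporary, the way GetChildren() does.
LoggedVector MakeChildren(std::vector<std::string>& log) {
    return {std::vector<int>{10, 20}, log};
}

std::vector<std::string> TraceRangeFor() {
    std::vector<std::string> log;
    int sum = 0;
    // The temporary is bound to the loop's hidden range reference, so it
    // stays alive for every iteration of the body.
    for (const int& value : MakeChildren(log)) {
        log.push_back("body " + std::to_string(value));
        sum += value;
    }
    log.push_back("after loop");
    assert(sum == 30);
    return log;
}
```

In the returned log, "destroyed" appears only after both "body" entries and before "after loop", so every reference taken inside the loop body refers to a live element.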


Similarly,

xmlNodePtr entry = GetChildren(child)[0];

just copies the pointer out of the vector before the temporary is destroyed, so there is no reference left to dangle.
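Since indexing [0] on an empty vector is also undefined behavior, one possible way to combine the copy with an emptiness check is a small helper. FirstOrNull and CheckFirstOrNull are hypothetical names, not part of libxml2 or the original post, and the self-check uses plain ints in place of xmlNode.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the first pointer in the vector by value,
// or nullptr when the vector is empty.
template <typename T>
T* FirstOrNull(const std::vector<T*>& items) {
    return items.empty() ? nullptr : items.front();
}

// Small self-check using ints in place of xmlNode.
inline void CheckFirstOrNull() {
    int a = 1;
    int b = 2;
    // Each temporary vector dies at the end of its full expression, but
    // the pointer value has already been copied out by then.
    int* first = FirstOrNull(std::vector<int*>{&a, &b});
    assert(first == &a);
    assert(FirstOrNull(std::vector<int*>{}) == nullptr);
}
```

Assuming xmlNodePtr is the usual libxml2 typedef for xmlNode*, the problematic line could then be written as xmlNodePtr entry = FirstOrNull(GetChildren(child)); followed by a null check. Alternatively, storing the result of GetChildren(child) in a named local vector keeps the elements alive for as long as that variable is in scope, in which case a const reference into it is fine.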
