python xml.etree - remove node but keep children (assign children to grandparents)

Problem description

In Python, how do I remove a node but keep its children using the xml.etree API?

Yes, I know there's an answer using lxml, but since xml.etree is part of the Python standard library, I figure it deserves an answer too.

Original XML file:

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

Let's say I want to remove the country nodes but keep their children and assign them to the parent of country.

Ideally, I want a solution that does things "in place" instead of creating a new tree.

My (non-working) solution:

# Get all parents of `country`
for country_parent in root.findall(".//country/.."):
    print(country_parent.tag)
    # Some countries could have same parent so get all
    # `country` nodes of current parent
    for country in country_parent.findall("./country"):
        print('\t', country.tag)
        # For each child of `country`, assign it to parent
        # and then delete it from `parent`
        for country_child in country:
            print('\t\t', country_child.tag)
            country_parent.append(country_child)
            country.remove(country_child)
        country_parent.remove(country)
tree.write("test_mod.xml")

Output of my print statements:

data
     country
         rank
         gdppc
         neighbor
     country
         rank
         gdppc
     country
         rank
         gdppc
         neighbor

Right away we can see there's a problem: every country is missing the year tag, and some neighbor tags are missing too.

Resulting .xml output:

<data>
    <rank>1</rank>
        <gdppc>141100</gdppc>
        <neighbor direction="W" name="Switzerland" />
    <rank>4</rank>
        <gdppc>59900</gdppc>
        <rank>68</rank>
        <gdppc>13600</gdppc>
        <neighbor direction="E" name="Colombia" />
    </data>

This is obviously wrong.

QUESTION: Why does this happen?

I imagine the appending/removing breaks something in the list of children, i.e. I've "invalidated" the list, much like invalidating an iterator.

Recommended answer

Remove this line from your program:

        country.remove(country_child)
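
For reference, here is a minimal runnable sketch of the loop with that line removed. It assumes a trimmed, inline copy of the question's XML (parsed with ET.fromstring rather than read from a file) and omits the print statements. It relies on the fact that Element.append does not detach a child from its previous parent, so the children simply go away together with country when country itself is removed.

```python
import xml.etree.ElementTree as ET

# Trimmed, inline copy of the question's XML (an assumption for brevity).
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
    </country>
</data>"""

root = ET.fromstring(XML)

# findall() returns plain lists, so removing `country` inside the loop is safe.
for country_parent in root.findall(".//country/.."):
    for country in country_parent.findall("./country"):
        # `country` itself is never modified while we iterate over it.
        for country_child in country:
            country_parent.append(country_child)
        # Dropping `country` also drops its references to the moved children.
        country_parent.remove(country)

print([child.tag for child in root])
# ['rank', 'year', 'neighbor', 'rank', 'year']
```

Note that append places the moved children at the end of the parent. If the parent has other children and the original position matters, country_parent.insert(index, country_child) could be used instead.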

Iterating over an xml.etree.ElementTree.Element is essentially passed through to its underlying list of sub-elements. Modifying that list during iteration yields odd results: each removal shifts the remaining children down by one position, so the loop skips the element that follows the one just removed.
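
The skipping can be reproduced in isolation. The sketch below uses a made-up, single-element document with the same five child tags as a country in the question. It first removes children while iterating over the element directly, then repeats the loop over a snapshot made with list(), which is one common way to stay safe when removal really is needed.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one <country> element from the question.
SOURCE = "<country><rank/><year/><gdppc/><neighbor/><neighbor/></country>"

# Removing while iterating over the element itself skips every other child.
country = ET.fromstring(SOURCE)
visited_live = []
for child in country:
    visited_live.append(child.tag)
    country.remove(child)
left_behind = [child.tag for child in country]

# Iterating over a snapshot visits every child; removal no longer interferes.
country = ET.fromstring(SOURCE)
visited_snapshot = []
for child in list(country):
    visited_snapshot.append(child.tag)
    country.remove(child)

print(visited_live)      # ['rank', 'gdppc', 'neighbor']
print(left_behind)       # ['year', 'neighbor']
print(visited_snapshot)  # ['rank', 'year', 'gdppc', 'neighbor', 'neighbor']
```

The first result matches the print output in the question: rank, gdppc, and one neighbor are visited, while year and the other neighbor are never seen.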
