python xml.etree - remove node but keep children (assign children to grandparent)
Problem description
In Python, how do I remove a node but keep its children using the xml.etree API?
Yes, I know there's an answer using lxml, but since xml.etree is part of the Python standard library, I figure it deserves an answer too.
Original XML file:
<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
Let's say I want to remove the country nodes but keep their children, assigning them to the parent of country.
Ideally, I want a solution that does things "in place" instead of creating a new tree.
My (non-working) solution:
import xml.etree.ElementTree as ET

# Assumed setup (not shown in the question): "test.xml" is a placeholder
# name for the original XML file above.
tree = ET.parse("test.xml")
root = tree.getroot()

# Get all parents of `country`
for country_parent in root.findall(".//country/.."):
    print(country_parent.tag)
    # Some countries could have the same parent, so get all
    # `country` nodes of the current parent
    for country in country_parent.findall("./country"):
        print('\t', country.tag)
        # For each child of `country`, assign it to the parent
        # and then delete it from `country`
        for country_child in country:
            print('\t\t', country_child.tag)
            country_parent.append(country_child)
            country.remove(country_child)
        country_parent.remove(country)
tree.write("test_mod.xml")
Output of my print statements:
data
    country
        rank
        gdppc
        neighbor
    country
        rank
        gdppc
    country
        rank
        gdppc
        neighbor
Right away we can see there's a problem: each country is missing its year tag and some of its neighbor tags.
Resulting .xml output:
<data>
    <rank>1</rank>
    <gdppc>141100</gdppc>
    <neighbor direction="W" name="Switzerland" />
    <rank>4</rank>
    <gdppc>59900</gdppc>
    <rank>68</rank>
    <gdppc>13600</gdppc>
    <neighbor direction="E" name="Colombia" />
</data>
This is clearly wrong.
QUESTION: Why does this happen?
I imagine the appending/removing is breaking something in the list, i.e. I've "invalidated" the list, much like invalidating an iterator.
Recommended answer
Remove this line from your program:
country.remove(country_child)
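With that line removed, the loop might look like the sketch below. It assumes a shortened, inline copy of the question's XML, parsed with ET.fromstring instead of from a file, and it drops the print calls. Appending alone is enough because ElementTree elements keep no reference to their parent: a child can be attached to data without first being detached from country, and the later country_parent.remove(country) discards the old container.

```python
import xml.etree.ElementTree as ET

# Shortened inline copy of the question's XML (assumption: the real data
# comes from a file and contains a third country).
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML)

# findall() returns a list, so removing `country` from its parent inside
# the middle loop does not disturb that loop.
for country_parent in root.findall(".//country/.."):
    for country in country_parent.findall("./country"):
        # `country` itself is never modified while it is being iterated.
        for country_child in country:
            country_parent.append(country_child)
        country_parent.remove(country)

tags = [child.tag for child in root]
print(tags)
```

Note that the children are appended at the end of the parent, so they keep their relative order but do not necessarily land where the removed country used to be.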
Iterating over an xml.etree.ElementTree.Element is essentially passed through to its underlying list of sub-elements. Modifying that list during iteration yields odd results: each removal shifts the remaining children one position to the left, so the loop skips whichever element moves into the vacated slot.
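The skipping can be reproduced on a small element. The sketch below builds a made-up five-child element that mirrors one country from the question (the helper make_country is hypothetical, not part of xml.etree). It compares removing children from the element being iterated with removing them while iterating over a snapshot taken with list(), which is a common alternative when the children really do need to be detached.

```python
import xml.etree.ElementTree as ET

def make_country():
    # Hypothetical element mirroring one <country> from the question.
    return ET.fromstring(
        "<country><rank/><year/><gdppc/><neighbor/><neighbor/></country>"
    )

# Removing while iterating the element itself: every removal shifts the
# remaining children left, so the next index skips one element.
country = make_country()
visited_live = []
for child in country:
    visited_live.append(child.tag)
    country.remove(child)

# Iterating over a snapshot: list(country) copies the child references
# first, so removals from the element no longer affect the loop.
country = make_country()
visited_snapshot = []
for child in list(country):
    visited_snapshot.append(child.tag)
    country.remove(child)

print(visited_live)      # ['rank', 'gdppc', 'neighbor']
print(visited_snapshot)  # ['rank', 'year', 'gdppc', 'neighbor', 'neighbor']
```

The first list matches the pattern in the question's print output: year and one neighbor are never visited.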