Merge XML files with nested elements without external libraries

Question

I am trying to merge multiple XML files using Python, without any external libraries. The XML files have nested elements.

Sample file 1:

<root>
  <element1>textA</element1>
  <elements>
    <nested1>text now</nested1>
  </elements>
</root>

Sample file 2:

<root>
  <element2>textB</element2>
  <elements>
    <nested1>text after</nested1>
    <nested2>new text</nested2>
  </elements>
</root>

What I want:

<root>
  <element1>textA</element1>    
  <element2>textB</element2>  
  <elements>
    <nested1>text after</nested1>
    <nested2>new text</nested2>
  </elements>  
</root>  

What I have tried:

From this answer:

from xml.etree import ElementTree as et
def combine_xml(files):
    first = None
    for filename in files:
        data = et.parse(filename).getroot()
        if first is None:
            first = data
        else:
            first.extend(data)
    if first is not None:
        return et.tostring(first)

What I get:

<root>
  <element1>textA</element1>
  <elements>
    <nested1>text now</nested1>
  </elements>
  <element2>textB</element2>
  <elements>
    <nested1>text after</nested1>
    <nested2>new text</nested2>
  </elements>
</root>

I hope the problem is clear. I am looking for a proper solution, and any guidance would be appreciated.

To clarify: with my current solution, nested elements are not merged. The second file's <elements> block is appended as a duplicate instead of being combined with the first one.
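To make the symptom concrete, here is a minimal sketch that reproduces it. It assumes the two sample documents are given as in-memory strings rather than files, purely so that the snippet is self-contained.

```python
from xml.etree import ElementTree as et

# The two sample documents from above, inlined as strings (an assumption
# made for illustration; the question reads them from files).
first = et.fromstring(
    "<root><element1>textA</element1>"
    "<elements><nested1>text now</nested1></elements></root>"
)
second = et.fromstring(
    "<root><element2>textB</element2>"
    "<elements><nested1>text after</nested1>"
    "<nested2>new text</nested2></elements></root>"
)

# extend() appends every child of `second` to `first`;
# it never checks whether a child with the same tag already exists.
first.extend(second)

child_tags = [child.tag for child in first]
print(child_tags)
```

The list of child tags contains `elements` twice, which matches the unwanted output shown above.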

Recommended answer

The code you posted appends all of the elements, regardless of whether an element with the same tag already exists. You need to iterate over the elements and check and combine them manually in whatever way you see fit, because this kind of merge is not a standard way of handling XML files. I can't explain it better than in code, so here it is, more or less commented:

from xml.etree import ElementTree as et

class XMLCombiner(object):
    def __init__(self, filenames):
        assert len(filenames) > 0, 'No filenames!'
        # save all the roots, in order, to be processed later
        self.roots = [et.parse(f).getroot() for f in filenames]

    def combine(self):
        for r in self.roots[1:]:
            # combine each element with the first one, and update that
            self.combine_element(self.roots[0], r)
        # return the string representation
        return et.tostring(self.roots[0])

    def combine_element(self, one, other):
        """
        This function recursively updates either the text or the children
        of an element if another element is found in `one`, or adds it
        from `other` if not found.
        """
        # Create a mapping from tag name to element, as that's what we are filtering with
        mapping = {el.tag: el for el in one}
        for el in other:
            if len(el) == 0:
                # Not nested
                try:
                    # Update the text
                    mapping[el.tag].text = el.text
                except KeyError:
                    # An element with this name is not in the mapping
                    mapping[el.tag] = el
                    # Add it
                    one.append(el)
            else:
                try:
                    # Recursively process the element, and update it in the same way
                    self.combine_element(mapping[el.tag], el)
                except KeyError:
                    # Not in the mapping
                    mapping[el.tag] = el
                    # Just add it
                    one.append(el)

if __name__ == '__main__':
    r = XMLCombiner(('sample1.xml', 'sample2.xml')).combine()
    print('-' * 20)
    # et.tostring() returns bytes on Python 3, so decode before printing
    print(r.decode())
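The same idea can be tried without any files on disk. The following is a minimal sketch, not the answer's own code: it assumes the documents are passed as strings, and the names `merge_into` and `merge_xml_strings` are hypothetical helpers introduced here for illustration.

```python
from xml.etree import ElementTree as et

SAMPLE_1 = """<root>
  <element1>textA</element1>
  <elements>
    <nested1>text now</nested1>
  </elements>
</root>"""

SAMPLE_2 = """<root>
  <element2>textB</element2>
  <elements>
    <nested1>text after</nested1>
    <nested2>new text</nested2>
  </elements>
</root>"""


def merge_into(one, other):
    """Recursively merge the children of `other` into `one`, matching by tag."""
    # Map each child tag of `one` to its element
    # (if a tag is repeated among siblings, the last one wins).
    mapping = {el.tag: el for el in one}
    for el in other:
        target = mapping.get(el.tag)
        if target is None:
            # No child with this tag yet: append the whole subtree
            mapping[el.tag] = el
            one.append(el)
        elif len(el) == 0:
            # Leaf element: the later document's text replaces the earlier one
            target.text = el.text
        else:
            # Nested element: merge its children recursively
            merge_into(target, el)


def merge_xml_strings(documents):
    """Merge XML documents given as strings (hypothetical helper).

    Assumes at least one document; returns the merged root element.
    """
    roots = [et.fromstring(doc) for doc in documents]
    for other in roots[1:]:
        merge_into(roots[0], other)
    return roots[0]


merged = merge_xml_strings([SAMPLE_1, SAMPLE_2])
print(et.tostring(merged, encoding="unicode"))
```

Two things to keep in mind with this approach: new elements are appended after the existing ones, so `element2` ends up after `elements` rather than before it, and because children are matched by tag only, repeated sibling tags and attributes are not handled.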
