How can this function be rewritten to implement OrderedDict?

Views: 21. Published: 2021/12/3 14:23:35. Tags: python, xml, collections, lxml

Problem description

I have the following function which does a crude job of parsing an XML file into a dictionary.

Unfortunately, since Python dictionaries are not ordered, I am unable to cycle through the nodes as I would like.

How do I change this so that it outputs an ordered dictionary which reflects the original order of the nodes when iterated with a for loop?

def simplexml_load_file(file):
    import collections
    from lxml import etree

    tree = etree.parse(file)
    root = tree.getroot()

    def xml_to_item(el):
        item = None
        if el.text:
            item = el.text
        child_dicts = collections.defaultdict(list)
        for child in el.getchildren():
            child_dicts[child.tag].append(xml_to_item(child))
        return dict(child_dicts) or item

    def xml_to_dict(el):
        return {el.tag: xml_to_item(el)}

    return xml_to_dict(root)

x = simplexml_load_file('routines/test.xml')

print(x)

for y in x['root']:
    print(y)

Output:

{'root': {
    'a': ['1'],
    'aa': [{'b': [{'c': ['2']}, '2']}],
    'aaaa': [{'bb': ['4']}],
    'aaa': ['3'],
    'aaaaa': ['5']
}}

a
aa
aaaa
aaa
aaaaa

How can I implement collections.OrderedDict so that I can be sure of getting the correct order of the nodes?

XML file for reference:

<root>
    <a>1</a>
    <aa>
        <b>
            <c>2</c>
        </b>
        <b>2</b>
    </aa>
    <aaa>3</aaa>
    <aaaa>
        <bb>4</bb>
    </aaaa>
    <aaaaa>5</aaaaa>
</root>

Recommended answer

You could use the new OrderedDict dict subclass, which was added to the standard library's collections module in version 2.7✶. Actually, what you need is an Ordered+defaultdict combination, which doesn't exist, but it's possible to create one by subclassing OrderedDict as illustrated below:

✶ If your version of Python doesn't have OrderedDict, you should be able to use Raymond Hettinger's Ordered Dictionary for Py2.4 ActiveState recipe as the base class instead.

import collections

class OrderedDefaultdict(collections.OrderedDict):
    """ A defaultdict with OrderedDict as its base class. """

    def __init__(self, default_factory=None, *args, **kwargs):
        if not (default_factory is None or callable(default_factory)):
            raise TypeError('first argument must be callable or None')
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory  # called by __missing__()

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key,)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):  # Optional, for pickle support.
        args = (self.default_factory,) if self.default_factory else tuple()
        return self.__class__, args, None, None, iter(self.items())

    def __repr__(self):  # Optional.
        return '%s(%r, %r)' % (self.__class__.__name__, self.default_factory, self.items())

def simplexml_load_file(file):
    from lxml import etree

    tree = etree.parse(file)
    root = tree.getroot()

    def xml_to_item(el):
        item = el.text or None
        child_dicts = OrderedDefaultdict(list)
        for child in el:  # iterating an element yields its children in document order (getchildren() is deprecated in lxml)
            child_dicts[child.tag].append(xml_to_item(child))
        return collections.OrderedDict(child_dicts) or item

    def xml_to_dict(el):
        return {el.tag: xml_to_item(el)}

    return xml_to_dict(root)

x = simplexml_load_file('routines/test.xml')
print(x)

for y in x['root']:
    print(y)

The output produced from your test XML file looks like this:

{'root':
    OrderedDict(
        [('a', ['1']),
         ('aa', [OrderedDict([('b', [OrderedDict([('c', ['2'])]), '2'])])]),
         ('aaa', ['3']),
         ('aaaa', [OrderedDict([('bb', ['4'])])]),
         ('aaaaa', ['5'])
        ]
    )
}

a
aa
aaa
aaaa
aaaaa

Which I think is close to what you want.
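For readers who don't have lxml installed, or who would rather not subclass, the same grouping can be sketched with the standard library alone. This is an alternative, not the approach above: it assumes `xml.etree.ElementTree` in place of lxml, parses the test XML from a string rather than from `routines/test.xml`, and uses `OrderedDict.setdefault` instead of `OrderedDefaultdict`. The helper name `xml_to_ordered` is hypothetical.

```python
import collections
import xml.etree.ElementTree as ET

# The test XML from the question, inlined so the sketch is self-contained.
XML = """<root>
    <a>1</a>
    <aa>
        <b>
            <c>2</c>
        </b>
        <b>2</b>
    </aa>
    <aaa>3</aaa>
    <aaaa>
        <bb>4</bb>
    </aaaa>
    <aaaaa>5</aaaaa>
</root>"""


def xml_to_item(el):
    """Return an OrderedDict of child tag -> list of items, or the element text."""
    children = collections.OrderedDict()
    for child in el:  # iterating an element yields children in document order
        # setdefault inserts an empty list the first time a tag is seen
        children.setdefault(child.tag, []).append(xml_to_item(child))
    return children or el.text or None


def xml_to_ordered(xml_text):  # hypothetical helper name
    root = ET.fromstring(xml_text)
    return {root.tag: xml_to_item(root)}


x = xml_to_ordered(XML)
keys = list(x['root'])
print(keys)
```

Because `setdefault` covers the "create a list on first access" step, no `__missing__` hook is needed here. On Python 3.7 and later, the built-in dict also preserves insertion order, so a plain `{}` in place of `collections.OrderedDict()` should give the same iteration order.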

Minor update:

Added a __reduce__() method which will allow the instances of the class to be pickled and unpickled properly. This wasn't necessary for this question, but came up in a similar one.
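A minimal sketch of what that round trip looks like, assuming Python 3 and a trimmed copy of the class (the callable check and `__repr__` are left out for brevity). The variable names `d` and `restored` are only for illustration.

```python
import collections
import pickle


class OrderedDefaultdict(collections.OrderedDict):
    """Trimmed copy of the class from the answer, keeping what pickling needs."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        # Tuple layout: (callable, constructor args, state, list items, dict items)
        args = (self.default_factory,) if self.default_factory else tuple()
        return self.__class__, args, None, None, iter(self.items())


d = OrderedDefaultdict(list)
d['b'].append(1)  # 'b' is inserted first on purpose
d['a'].append(2)

restored = pickle.loads(pickle.dumps(d))
print(list(restored.items()))
print(restored.default_factory)
```

The fifth element of the tuple returned by `__reduce__()` is an iterator of key/value pairs, which pickle replays with item assignment after rebuilding the object from `default_factory`, so both the insertion order and the factory survive.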
