How can this function be rewritten to implement OrderedDict?
Problem description
I have the following function, which does a crude job of parsing an XML file into a dictionary.
Unfortunately, since Python dictionaries are not ordered, I am unable to cycle through the nodes as I would like.
How do I change this so that it outputs an ordered dictionary that reflects the original order of the nodes when looped over with for?
def simplexml_load_file(file):
    import collections
    from lxml import etree

    tree = etree.parse(file)
    root = tree.getroot()

    def xml_to_item(el):
        item = None
        if el.text:
            item = el.text
        child_dicts = collections.defaultdict(list)
        for child in el.getchildren():
            child_dicts[child.tag].append(xml_to_item(child))
        return dict(child_dicts) or item

    def xml_to_dict(el):
        return {el.tag: xml_to_item(el)}

    return xml_to_dict(root)

x = simplexml_load_file('routines/test.xml')

print x
for y in x['root']:
    print y
Outputs:
{'root': {
'a': ['1'],
'aa': [{'b': [{'c': ['2']}, '2']}],
'aaaa': [{'bb': ['4']}],
'aaa': ['3'],
'aaaaa': ['5']
}}
a
aa
aaaa
aaa
aaaaa
How can I implement collections.OrderedDict so that I can be sure of getting the correct order of the nodes?
XML file for reference:
<root>
    <a>1</a>
    <aa>
        <b>
            <c>2</c>
        </b>
        <b>2</b>
    </aa>
    <aaa>3</aaa>
    <aaaa>
        <bb>4</bb>
    </aaaa>
    <aaaaa>5</aaaaa>
</root>
Recommended answer
You could use the OrderedDict dict subclass, which was added to the standard library's collections module in version 2.7✶. What you actually need is a combination of OrderedDict and defaultdict, which doesn't exist in the standard library, but you can create one by subclassing OrderedDict as illustrated below:
✶ If your version of Python doesn't have OrderedDict, you should be able to use Raymond Hettinger's Ordered Dictionary for Py2.4 ActiveState recipe as the base class instead.
import collections

class OrderedDefaultdict(collections.OrderedDict):
    """ A defaultdict with OrderedDict as its base class. """

    def __init__(self, default_factory=None, *args, **kwargs):
        if not (default_factory is None or callable(default_factory)):
            raise TypeError('first argument must be callable or None')
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory  # called by __missing__()

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):  # Optional, for pickle support.
        args = (self.default_factory,) if self.default_factory else tuple()
        return self.__class__, args, None, None, iter(self.items())

    def __repr__(self):  # Optional.
        return '%s(%r, %r)' % (self.__class__.__name__, self.default_factory, list(self.items()))

def simplexml_load_file(file):
    from lxml import etree

    tree = etree.parse(file)
    root = tree.getroot()

    def xml_to_item(el):
        item = el.text or None
        child_dicts = OrderedDefaultdict(list)
        for child in el:  # iterating an element yields its children in document order
            child_dicts[child.tag].append(xml_to_item(child))
        return collections.OrderedDict(child_dicts) or item

    def xml_to_dict(el):
        return {el.tag: xml_to_item(el)}

    return xml_to_dict(root)

x = simplexml_load_file('routines/test.xml')
print(x)

for y in x['root']:
    print(y)
The output produced from your test XML file looks like this:
{'root':
OrderedDict(
[('a', ['1']),
('aa', [OrderedDict([('b', [OrderedDict([('c', ['2'])]), '2'])])]),
('aaa', ['3']),
('aaaa', [OrderedDict([('bb', ['4'])])]),
('aaaaa', ['5'])
]
)
}
a
aa
aaa
aaaa
aaaaa
Which I think is close to what you want.
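If a dedicated subclass feels like more than the job needs, the same grouping can probably be done with a plain OrderedDict and its setdefault() method. The sketch below is only an illustration under a few assumptions: it uses the standard library's xml.etree.ElementTree in place of lxml so that it runs without extra packages, it parses the test XML from a string rather than from routines/test.xml, and parse_xml_string is a made-up helper name. On Python 3.7 and later, a regular dict also preserves insertion order, so OrderedDict could be swapped for dict there.

```python
import collections
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree (assumption)

# The test XML from the question, inlined so the example is self-contained.
XML = """<root>
  <a>1</a>
  <aa><b><c>2</c></b><b>2</b></aa>
  <aaa>3</aaa>
  <aaaa><bb>4</bb></aaaa>
  <aaaaa>5</aaaaa>
</root>"""


def xml_to_item(el):
    """Return an OrderedDict of child lists, or the element's text for a leaf."""
    children = collections.OrderedDict()
    for child in el:  # children are visited in document order
        # setdefault() creates the list the first time a tag is seen.
        children.setdefault(child.tag, []).append(xml_to_item(child))
    return children or el.text or None


def parse_xml_string(text):  # hypothetical helper, not part of the original answer
    root = ET.fromstring(text)
    return {root.tag: xml_to_item(root)}


result = parse_xml_string(XML)
print(list(result['root']))  # ['a', 'aa', 'aaa', 'aaaa', 'aaaaa']
```

The trade-off is that setdefault() builds a new empty list on every call, even when the key already exists, which is usually negligible for small documents.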
Minor update:
Added a __reduce__() method, which allows instances of the class to be pickled and unpickled properly. This wasn't necessary for this question, but it came up in a similar one.
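A quick way to check the pickle support is a round trip through pickle.dumps() and pickle.loads(). The sketch below repeats a condensed copy of the class (the argument check and __repr__() are left out for brevity) and assumes it is defined at module level, since pickle has to look the class up by name when loading.

```python
import collections
import pickle


class OrderedDefaultdict(collections.OrderedDict):
    """Condensed copy of the class from the answer."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        # Rebuild as cls(default_factory), then re-insert the items in order.
        args = (self.default_factory,) if self.default_factory else ()
        return self.__class__, args, None, None, iter(self.items())


original = OrderedDefaultdict(list)
original['b'].append(1)
original['a'].append(2)

restored = pickle.loads(pickle.dumps(original))

print(list(restored.items()))            # [('b', [1]), ('a', [2])]
print(restored.default_factory is list)  # True
restored['new'].append(3)                # missing keys still get a fresh list
```

Both the insertion order and the default factory survive the round trip, which is what the fifth element of the tuple returned by __reduce__() and its args are there for.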