How can this function be rewritten to implement OrderedDict?

Problem description

I have the following function which does a crude job of parsing an XML file into a dictionary.

Unfortunately, since Python dictionaries are not ordered, I am unable to cycle through the nodes as I would like.

How do I change this so that it outputs an ordered dictionary which reflects the original order of the nodes when looped over with 'for'?

def simplexml_load_file(file):
    import collections
    from lxml import etree

    tree = etree.parse(file)
    root = tree.getroot()

    def xml_to_item(el):
        item = None
        if el.text:
            item = el.text
        child_dicts = collections.defaultdict(list)
        for child in el.getchildren():
            child_dicts[child.tag].append(xml_to_item(child))
        return dict(child_dicts) or item

    def xml_to_dict(el):
        return {el.tag: xml_to_item(el)}

    return xml_to_dict(root)

x = simplexml_load_file('routines/test.xml')

print x

for y in x['root']:
    print y

Outputs:

{'root': {
    'a': ['1'],
    'aa': [{'b': [{'c': ['2']}, '2']}],
    'aaaa': [{'bb': ['4']}],
    'aaa': ['3'],
    'aaaaa': ['5']
}}

a
aa
aaaa
aaa
aaaaa

How can I use collections.OrderedDict so that I can be sure of getting the correct order of the nodes?

XML file for reference:

<root>
    <a>1</a>
    <aa>
        <b>
            <c>2</c>
        </b>
        <b>2</b>
    </aa>
    <aaa>3</aaa>
    <aaaa>
        <bb>4</bb>
    </aaaa>
    <aaaaa>5</aaaaa>
</root>

Solution

You could use the OrderedDict dict subclass, which was added to the standard library's collections module in version 2.7*. What you actually need is an ordered defaultdict, a combination that doesn't exist in the standard library, but it's possible to create one by subclassing OrderedDict as illustrated below:

import collections

class OrderedDefaultdict(collections.OrderedDict):
    """ A defaultdict with OrderedDict as its base class. """

    def __init__(self, default_factory=None, *args, **kwargs):
        # callable() replaces collections.Callable, which newer Pythons removed
        if not (default_factory is None or callable(default_factory)):
            raise TypeError('first argument must be callable or None')
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory  # called by __missing__()

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):  # optional, for pickle support
        args = (self.default_factory,) if self.default_factory else tuple()
        return self.__class__, args, None, None, iter(self.items())

    def __repr__(self):  # optional
        return '%s(%r, %r)' % (self.__class__.__name__, self.default_factory,
                               list(self.items()))

def simplexml_load_file(file):
    from lxml import etree

    tree = etree.parse(file)
    root = tree.getroot()

    def xml_to_item(el):
        item = el.text or None
        child_dicts = OrderedDefaultdict(list)
        for child in el:  # iterating an element yields its children in document order
            child_dicts[child.tag].append(xml_to_item(child))
        return collections.OrderedDict(child_dicts) or item

    def xml_to_dict(el):
        return {el.tag: xml_to_item(el)}

    return xml_to_dict(root)

x = simplexml_load_file('routines/test.xml')
print(x)

for y in x['root']:
    print(y)

The output produced from your test XML file looks like this:

{'root':
    OrderedDict(
        [('a', ['1']),
         ('aa', [OrderedDict([('b', [OrderedDict([('c', ['2'])]), '2'])])]),
         ('aaa', ['3']),
         ('aaaa', [OrderedDict([('bb', ['4'])])]),
         ('aaaaa', ['5'])
        ]
    )
}

a
aa
aaa
aaaa
aaaaa

Which I think is close to what you want.

*If your version of Python doesn't have OrderedDict, which was introduced in v2.7, you may be able to use Raymond Hettinger's Ordered Dictionary for Py2.4 ActiveState recipe as a base class instead.
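At the other end of the version range, plain dict objects are guaranteed to preserve insertion order from Python 3.7 onward, so there a regular dict with setdefault() may be all that is needed. The following is only a minimal sketch under that assumption: it swaps lxml for the standard library's xml.etree.ElementTree and embeds the question's XML as a string (the XML constant is a stand-in for the file) so that it runs on its own.

```python
import xml.etree.ElementTree as ET

# The test XML from the question, inlined so the sketch is self-contained.
XML = """<root>
    <a>1</a>
    <aa>
        <b>
            <c>2</c>
        </b>
        <b>2</b>
    </aa>
    <aaa>3</aaa>
    <aaaa>
        <bb>4</bb>
    </aaaa>
    <aaaaa>5</aaaaa>
</root>"""


def xml_to_item(el):
    """Return {child tag: [converted children]} or, for a leaf, its text."""
    children = {}  # plain dict: insertion-ordered on Python 3.7+
    for child in el:  # children are yielded in document order
        children.setdefault(child.tag, []).append(xml_to_item(child))
    return children or (el.text or None)


def xml_to_dict(el):
    return {el.tag: xml_to_item(el)}


result = xml_to_dict(ET.fromstring(XML))
order = list(result['root'])
print(order)  # ['a', 'aa', 'aaa', 'aaaa', 'aaaaa']
```

On interpreters older than 3.7 this ordering is not guaranteed, which is where the OrderedDefaultdict class above remains the safer choice.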

Minor update:

Added a __reduce__() method which allows instances of the class to be pickled and unpickled properly. This wasn't necessary for this question, but came up in a similar one.
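To see what __reduce__() buys you, here is a small round-trip sketch. It restates a Python 3 version of the class so the snippet stands alone, and it uses copy.deepcopy(), which goes through the same __reduce__() hook that pickle uses; the sample keys and values are made up for illustration.

```python
import collections
import copy


class OrderedDefaultdict(collections.OrderedDict):
    """A defaultdict with OrderedDict as its base class (Python 3 restatement)."""

    def __init__(self, default_factory=None, *args, **kwargs):
        if not (default_factory is None or callable(default_factory)):
            raise TypeError('first argument must be callable or None')
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory  # called by __missing__()

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        # (callable, constructor args, state, list items, dict items)
        args = (self.default_factory,) if self.default_factory else ()
        return self.__class__, args, None, None, iter(self.items())


# Hypothetical sample data: keys deliberately inserted out of sorted order.
original = OrderedDefaultdict(list)
for tag, text in [('b', '2'), ('a', '1'), ('b', '3')]:
    original[tag].append(text)

clone = copy.deepcopy(original)  # rebuilt via __reduce__()
clone['c'].append('4')           # default_factory survived the copy

print(list(clone.items()))  # [('b', ['2', '3']), ('a', ['1']), ('c', ['4'])]
```

The copy keeps both the insertion order and the default factory, and appending to it leaves the original untouched.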
