How do I map to a dictionary rather than a list?

Question

I have the following function, which does a basic job of mapping an lxml object to a dictionary:

from lxml import etree 

tree = etree.parse('file.xml')
root = tree.getroot()

def xml_to_dict(el):
    d={}
    if el.text:
        print '***write tag as string'
        d[el.tag] = el.text
    else:
        d[el.tag] = {}
    children = el.getchildren()
    if children:
        d[el.tag] = map(xml_to_dict, children)
    return d

v = xml_to_dict(root)

>>>print v
{'root': [{'a': '1'}, {'a': [{'b': '2'}, {'b': '2'}]}, {'aa': '1a'}]}

But I want:

>>>print v
{'root': {'a': ['1', {'b': [2, 2]}], 'aa': '1a'}}

How do I rewrite the function xml_to_dict(el) so that I get the required output?

Here's the XML I'm parsing, for clarity.

<root>
    <a>1</a>
    <a>
        <b>2</b>
        <b>2</b>
    </a>
    <aa>1a</aa>
</root>

Thanks :)

Recommended answer

Well, in Python 2 map() always returns a list (in Python 3 it returns a lazy iterator), so the easy answer is "don't use map()". Instead, build a dictionary like you already are, by looping over the children and assigning the result of xml_to_dict(child) to the dictionary key you want to use. It looks like you want to use the tag as the key and have the value be a list of the items with that tag, so it would become something like:

import collections
from lxml import etree

tree = etree.parse('file.xml')
root = tree.getroot()

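def xml_to_dict(el):
    d = {}
    if el.text:
        print('***write tag as string')
        d[el.tag] = el.text
    # Group the converted children by tag; each tag maps to a list.
    child_dicts = collections.defaultdict(list)
    for child in el:  # iterating an element yields its direct children
        child_dicts[child.tag].append(xml_to_dict(child))
    if child_dicts:
        # Children take precedence and overwrite any text stored above.
        d[el.tag] = child_dicts
    return d

xml_to_dict(root)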

This leaves the tag entry in the dict as a defaultdict; if you want a normal dict for some reason, use d[el.tag] = dict(child_dicts). Note that, as before, if a tag has both text and children, the text won't appear in the dict. You may want to think about a different layout for your dict to cope with that.
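One possible layout that copes with a tag carrying both text and children is to give every element a fixed pair of slots. The following is only a sketch under stated assumptions: the name xml_to_node and the 'text' / 'children' keys are made up for illustration, and the standard library's xml.etree.ElementTree stands in for lxml.etree so the snippet runs without extra packages (the calls used here, fromstring, iteration over an element, .tag and .text, are also available in lxml).

```python
import collections
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption)


def xml_to_node(el):
    """Convert an element to a dict with separate text and children slots.

    The 'text' / 'children' layout is a hypothetical choice, not part of
    the original answer.
    """
    children = collections.defaultdict(list)
    for child in el:  # iterating an element yields its direct children
        children[child.tag].append(xml_to_node(child))
    # Treat whitespace-only text (indentation between tags) as no text.
    text = (el.text or '').strip()
    return {'text': text or None, 'children': dict(children)}


root = ET.fromstring('<a>hello<b>2</b><b>3</b></a>')
node = xml_to_node(root)
print(node['text'])                                # the text of <a>
print([b['text'] for b in node['children']['b']])  # the text of each <b>
```

Because the text and the children live under different keys, neither can overwrite the other. The cost is a more verbose structure than the one asked for in the question.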

Edit:

Code that would produce the output in your rephrased question wouldn't recurse in xml_to_dict, because you only want a dict keyed by tag for the outer element, not for all child tags. So you'd use something like:

import collections
from lxml import etree

tree = etree.parse('file.xml')
root = tree.getroot()

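def xml_to_item(el):
    item = None  # returned for an element with neither text nor children
    if el.text:
        print('***write tag as string')
        item = el.text
    child_dicts = collections.defaultdict(list)
    for child in el:  # iterating an element yields its direct children
        child_dicts[child.tag].append(xml_to_item(child))
    # An element with children becomes a dict; otherwise fall back to its text.
    return dict(child_dicts) or item

def xml_to_dict(el):
    return {el.tag: xml_to_item(el)}

print(xml_to_dict(root))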

This still doesn't handle tags with both text and children sanely, and it turns the collections.defaultdict(list) into a normal dict, so the output is (almost) as you expect. The remaining differences are that every child value is wrapped in a list, so 'aa' maps to ['1a'], and that the b values are strings:

***write tag as string
***write tag as string
***write tag as string
***write tag as string
***write tag as string
***write tag as string
{'root': {'a': ['1', {'b': ['2', '2']}], 'aa': ['1a']}}

(If you really want integers instead of strings for the text data in the b tags, you'll have to explicitly convert them to integers somehow.)
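To close the remaining gap with the output requested in the question, one option is to unwrap one-element lists and convert the text of selected tags with int(). This is a minimal sketch, not the answer's own code: the int_tags parameter is a hypothetical addition, whitespace-only text is stripped, the debug print is dropped, and xml.etree.ElementTree from the standard library stands in for lxml.etree so the snippet is self-contained.

```python
import collections
import xml.etree.ElementTree as ET  # stand-in for lxml.etree (assumption)

XML = """<root>
    <a>1</a>
    <a>
        <b>2</b>
        <b>2</b>
    </a>
    <aa>1a</aa>
</root>"""


def xml_to_item(el, int_tags=frozenset()):
    """Convert an element to a dict of its children, or to its text.

    int_tags is a hypothetical parameter naming the tags whose text
    should be converted with int().
    """
    children = collections.defaultdict(list)
    for child in el:  # iterating an element yields its direct children
        children[child.tag].append(xml_to_item(child, int_tags))
    if children:
        # Unwrap one-element lists so a lone tag maps straight to its value.
        return {tag: items[0] if len(items) == 1 else items
                for tag, items in children.items()}
    text = (el.text or '').strip()
    if el.tag in int_tags:
        return int(text)  # raises ValueError if the text is not an integer
    return text


def xml_to_dict(el, int_tags=frozenset()):
    return {el.tag: xml_to_item(el, int_tags)}


result = xml_to_dict(ET.fromstring(XML), int_tags={'b'})
print(result)
# {'root': {'a': ['1', {'b': [2, 2]}], 'aa': '1a'}}
```

Unwrapping one-element lists makes the shape of the result depend on how many siblings share a tag: an a element with a single b child would yield {'b': 2} rather than {'b': [2]}. Code that consumes the dict has to handle both shapes, which is why always keeping the list can be the safer choice.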
