Python json parser allow duplicate keys

Question

I need to parse a JSON file which, unfortunately for me, does not follow the expected format. I have two issues with the data. I've already found a workaround for one of them, so I'll just mention it at the end; maybe someone can help there as well.

So I need to parse entries like this:

    "Test":{
        "entry":{
            "Type":"Something"
                },
        "entry":{
            "Type":"Something_Else"
                }
           }, ...

The default json parser updates the dictionary as it goes and therefore keeps only the last entry. I HAVE to store the other one as well somehow, and I have no idea how to do this. I also HAVE to store the keys of the several dictionaries in the same order they appear in the file, which is why I am using an OrderedDict. That works fine, so if there is any way to extend it to handle the duplicate entries, I'd be grateful.

My second issue is that this very same JSON file contains entries like this:

         "Test":{
                   {
                       "Type":"Something"
                   }
                }

The json.load() function raises an exception when it reaches that line in the JSON file. The only way I worked around this was to manually remove the inner braces myself.

Thanks in advance.

Recommended answer

You can use the object_pairs_hook parameter of JSONDecoder to customize how JSONDecoder decodes objects. This hook function is passed a list of (key, value) pairs, which you would usually process in some way and then turn into a dict.

However, since Python dictionaries don't allow duplicate keys (and you simply can't change that), you can return the pairs unchanged from the hook and get a nested list of (key, value) pairs when you decode your JSON:

from json import JSONDecoder


def parse_object_pairs(pairs):
    # Return the (key, value) pairs as-is instead of building a dict,
    # so duplicate keys are not collapsed.
    return pairs


data = """
{"foo": {"baz": 42}, "foo": 7}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)

Output (Python 3):

[('foo', [('baz', 42)]), ('foo', 7)]

How you use this data structure is up to you. As stated above, Python dictionaries won't allow duplicate keys, and there's no way around that. How would you even do a lookup based on a key? dct[key] would be ambiguous.
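
One way to work with the nested pair list directly is a small lookup helper that returns every value stored under a key. This is only a sketch: get_all and keep_pairs are hypothetical names chosen for illustration, not part of the json module.

```python
import json


def keep_pairs(pairs):
    # Return the (key, value) pairs untouched so duplicates survive decoding.
    return pairs


def get_all(pairs, key):
    # Hypothetical helper: collect every value stored under `key`,
    # in the order the entries appear in the document.
    return [value for k, value in pairs if k == key]


obj = json.loads('{"foo": {"baz": 42}, "foo": 7}', object_pairs_hook=keep_pairs)
values = get_all(obj, "foo")
print(values)
```

The lookup is a linear scan, which should be fine for small objects but may be slow for very large ones.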

So you can either implement your own logic to handle lookups the way you expect them to work, or implement some sort of collision avoidance that makes the keys unique where they aren't, and then create a dictionary from your nested list.
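
As an illustration of the first option, the hook could collect all values that share a key into a list, so that a lookup returns every entry instead of an ambiguous single one. This is a sketch under that assumption; group_duplicates is a hypothetical name, and the sample data is modelled on the entries in the question.

```python
import json
from collections import OrderedDict


def group_duplicates(pairs):
    # Hypothetical hook: the first occurrence of a key fixes its position,
    # and every value seen for that key is appended to a list.
    grouped = OrderedDict()
    for key, value in pairs:
        grouped.setdefault(key, []).append(value)
    return grouped


data = '{"Test": {"entry": {"Type": "Something"}, "entry": {"Type": "Something_Else"}}}'
obj = json.loads(data, object_pairs_hook=group_duplicates)

# Every value is wrapped in a list, including those whose key is unique.
types = [entry["Type"][0] for entry in obj["Test"][0]["entry"]]
print(types)
```

The trade-off is that every value ends up inside a list, even when its key occurs only once, so the calling code has to index into those lists.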

Edit: Since you said you would like to modify the duplicate keys to make them unique, here's how you'd do that:

from collections import OrderedDict
from json import JSONDecoder


def make_unique(key, dct):
    # Append _1, _2, ... to the key until it no longer collides
    # with a key already stored in dct.
    counter = 0
    unique_key = key

    while unique_key in dct:
        counter += 1
        unique_key = '{}_{}'.format(key, counter)
    return unique_key


def parse_object_pairs(pairs):
    # Build an OrderedDict so the keys keep their document order.
    dct = OrderedDict()
    for key, value in pairs:
        if key in dct:
            key = make_unique(key, dct)
        dct[key] = value

    return dct


data = """
{"foo": {"baz": 42, "baz": 77}, "foo": 7, "foo": 23}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)

Output (Python 3; the exact OrderedDict repr varies between Python versions):

OrderedDict([('foo', OrderedDict([('baz', 42), ('baz_1', 77)])), ('foo_1', 7), ('foo_2', 23)])

The make_unique function is responsible for returning a collision-free key. In this example it just suffixes the key with _n, where n is an incrementing counter. Adapt it to your needs.

Because object_pairs_hook receives the pairs in exactly the order they appear in the JSON document, it's also possible to preserve that order by using an OrderedDict, so I included that as well.
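
Since the question reads the data with json.load(), it may be worth noting that json.load() and json.loads() accept the same object_pairs_hook argument, so an explicit JSONDecoder should not be required. The sketch below condenses the renaming logic into a single hook and uses io.StringIO as a stand-in for a real file; the sample data is modelled on the entries in the question.

```python
import io
import json
from collections import OrderedDict


def parse_object_pairs(pairs):
    # Condensed version of the renaming hook: repeated keys get
    # the suffixes _1, _2, ... and document order is preserved.
    dct = OrderedDict()
    for key, value in pairs:
        unique_key, counter = key, 0
        while unique_key in dct:
            counter += 1
            unique_key = "{}_{}".format(key, counter)
        dct[unique_key] = value
    return dct


# io.StringIO stands in for a file object returned by open(...).
fake_file = io.StringIO(
    '{"Test": {"entry": {"Type": "Something"}, "entry": {"Type": "Something_Else"}}}'
)
obj = json.load(fake_file, object_pairs_hook=parse_object_pairs)
keys = list(obj["Test"].keys())
print(keys)
```

With a real file, the io.StringIO object would be replaced by the handle from a with open(...) block.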
