Functions that help to understand json(dict) structure

Question

I haven't found a way to do this. Let's say I receive a JSON object like this:

{'1_data':{'4_data':[{'5_data':'hooray'}, {'3_data':'hooray2'}], '2_data':[]}}

It's hard to tell at a glance how to get the value of the 3_data key: data['1_data']['4_data'][1]['3_data']

I know about pprint, which helps a bit with understanding the structure. But sometimes the data is huge, and reading through it takes time.

Are there any methods that may help me with that?

Answer

Here is a family of recursive generators that can be used to search through an object composed of dicts and lists. find_key yields a tuple containing a list of the dictionary keys and list indices that lead to the key you pass in; the tuple also contains the value associated with that key. Because it's a generator, it will find every match if the object contains multiple matching keys.

def find_key(obj, key):
    if isinstance(obj, dict):
        yield from iter_dict(obj, key, [])
    elif isinstance(obj, list):
        yield from iter_list(obj, key, [])

def iter_dict(d, key, indices):
    for k, v in d.items():
        if k == key:
            yield indices + [k], v
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])

def iter_list(seq, key, indices):
    for k, v in enumerate(seq):
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])

# test

data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ], 
        '2_data': []
    }
}

for t in find_key(data, '3_data'):
    print(t)

Output

(['1_data', '4_data', 1, '3_data'], 'hooray2')

To get a single key list you can pass the generator returned by find_key to the next function. And if you want to use a key list to fetch the associated value, you can use a simple for loop.

seq, val = next(find_key(data, '3_data'))
print('seq:', seq, 'val:', val)

obj = data
for k in seq:
    obj = obj[k]
print('obj:', obj, obj == val)

Output

seq: ['1_data', '4_data', 1, '3_data'] val: hooray2
obj: hooray2 True
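
The for loop above can also be wrapped in a small helper. The following is a minimal sketch and not part of the original answer: get_by_path and path_to_expr are hypothetical names introduced here for illustration. The first follows a key list using functools.reduce, and the second renders the same key list as the subscript expression you would otherwise type by hand (the variable name 'data' is an assumption).

```python
from functools import reduce
from operator import getitem

def get_by_path(obj, path):
    """Follow a sequence of dict keys / list indices into a nested object."""
    # reduce applies obj = obj[k] for each k in path, like the for loop above.
    return reduce(getitem, path, obj)

def path_to_expr(path, name='data'):
    """Render a path as a Python subscript expression (hypothetical helper)."""
    return name + ''.join('[{!r}]'.format(k) for k in path)

data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ],
        '2_data': []
    }
}

seq = ['1_data', '4_data', 1, '3_data']
print(get_by_path(data, seq))   # hooray2
print(path_to_expr(seq))        # data['1_data']['4_data'][1]['3_data']
```

Like the plain loop, get_by_path raises KeyError or IndexError if the path does not exist in the object.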

If the key may be missing, give next an appropriate default tuple. E.g.:

seq, val = next(find_key(data, '6_data'), ([], None))
print('seq:', seq, 'val:', val)
if seq:
    obj = data
    for k in seq:
        obj = obj[k]
    print('obj:', obj, obj == val)

Output

seq: [] val: None

Note that this code is for Python 3. To run it on Python 2 you need to replace all the yield from statements, e.g. replace

yield from iter_dict(obj, key, [])

with

for u in iter_dict(obj, key, []):
    yield u

How it works

To understand how this code works you need to be familiar with recursion and with Python generators. You may also find this page helpful: Understanding Generators in Python; there are also various Python generators tutorials available online.

The Python object returned by json.load or json.loads is generally a dict, but it can also be a list. We pass that object to the find_key generator as the obj arg, along with the key string that we want to locate. find_key then calls either iter_dict or iter_list, as appropriate, passing it the object, the key, and an empty indices list, which is used to collect the dict keys and list indices that lead to the key we want.

iter_dict iterates over each (k, v) pair at the top level of its d dict arg. If k matches the key we're looking for, then the current indices list is yielded with k appended to it, along with the associated value. Because iter_dict is recursive, the yielded (indices list, value) pairs get passed up to the previous level of recursion by yield from, eventually making their way up to find_key and then to the code that called find_key. Note that the k == key test is the "base case" of our recursion: it's the part of the code that determines whether this recursion path leads to the key we want. If a recursion path never finds a key matching the one we're looking for, it terminates without yielding anything.

If the current v is a dict, then we need to examine all the (key, value) pairs it contains. We do that by making a recursive call to iter_dict, passing v as its starting object together with the current indices list extended by k. If the current v is a list, we instead call iter_list, passing it the same args.
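
One detail worth spelling out is that the recursive calls pass indices + [k] rather than appending to indices. The short illustration below (written for this explanation, not taken from the original answer) shows the difference: concatenation builds a new list for each recursion path, so sibling branches never see each other's keys.

```python
indices = ['1_data']

# Concatenation builds a brand-new list; the caller's list is untouched.
extended = indices + ['4_data']
print(indices)    # ['1_data']
print(extended)   # ['1_data', '4_data']

# By contrast, append mutates the list in place, so every holder of a
# reference to the same list would see the change.
shared = indices
shared.append('2_data')
print(indices)    # ['1_data', '2_data']
```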

iter_list works similarly to iter_dict, except that a list doesn't have any keys, only values. So we don't perform the k == key test; we just recurse into any dicts or lists that the list contains.

The end result of this process is that when we iterate over find_key we get (indices, value) pairs, where each indices list is the sequence of dict keys and list indices that successfully terminates in a dict item with our desired key, and value is the value associated with that particular key.
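
To see several matches come back from one search, here is a condensed sketch that folds the three generators into a single function. It is an illustration written for this explanation, not the author's code: find_key_paths is a hypothetical name, and the sample data is made up.

```python
def find_key_paths(obj, key, path=()):
    """Yield (path, value) for every dict item whose key equals `key`."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        return  # scalars contain nothing to search
    for k, v in items:
        # Only dict keys can match; list indices are just steps on the path.
        if isinstance(obj, dict) and k == key:
            yield list(path) + [k], v
        yield from find_key_paths(v, key, path + (k,))

# Made-up sample data with the same key at several depths.
sample = {
    'a': {
        'name': 'outer',
        'items': [{'name': 'first'}, {'other': 1}, [{'name': 'deep'}]],
    },
    'name': 'top',
}

matches = list(find_key_paths(sample, 'name'))
for path, value in matches:
    print(path, value)
```

Each printed line pairs a path such as ['a', 'items', 0, 'name'] with the value found there, in the order the object is traversed.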

If you'd like to see some other examples of this code in use, please see "How to modify the key of a nested Json" and "How can I select deeply nested key:values from dictionary in python".

Also take a look at my new, more streamlined show_indices function.
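
The show_indices function itself is not reproduced on this page. As a rough illustration of the general idea of listing every path in a nested object, here is a sketch under stated assumptions: iter_paths is a hypothetical name, this is not the author's implementation, and the variable name 'data' in the printed expressions is assumed.

```python
def iter_paths(obj, path=()):
    """Yield (path, leaf) for every leaf value in nested dicts and lists."""
    if isinstance(obj, dict) and obj:
        for k, v in obj.items():
            yield from iter_paths(v, path + (k,))
    elif isinstance(obj, list) and obj:
        for i, v in enumerate(obj):
            yield from iter_paths(v, path + (i,))
    else:
        # Scalars and empty containers are treated as leaves.
        yield path, obj

data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ],
        '2_data': []
    }
}

for path, leaf in iter_paths(data):
    expr = 'data' + ''.join('[{!r}]'.format(k) for k in path)
    print(expr, '=', repr(leaf))
# data['1_data']['4_data'][0]['5_data'] = 'hooray'
# data['1_data']['4_data'][1]['3_data'] = 'hooray2'
# data['1_data']['2_data'] = []
```

Printing one subscript expression per leaf gives a compact overview of a large object without having to read through its full pprint output.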
