Functions that help to understand json(dict) structure
Question
I haven't found a way to do this. Let's say I receive a JSON object like this:
{'1_data':{'4_data':[{'5_data':'hooray'}, {'3_data':'hooray2'}], '2_data':[]}}
It's hard to tell at a glance how I should get the value of the 3_data key: data['1_data']['4_data'][1]['3_data']
I know about pprint, which helps to understand the structure a bit. But sometimes the data is huge, and reading through it takes time.
Are there any methods that may help me with that?
Recommended answer
Here is a family of recursive generators that can be used to search through an object composed of dicts and lists. find_key yields tuples, each containing a list of the dictionary keys and list indices that lead to the key you pass in, together with the value associated with that key. Because it's a generator, it will find every match if the object contains the key more than once.
def find_key(obj, key):
    if isinstance(obj, dict):
        yield from iter_dict(obj, key, [])
    elif isinstance(obj, list):
        yield from iter_list(obj, key, [])

def iter_dict(d, key, indices):
    for k, v in d.items():
        if k == key:
            yield indices + [k], v
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])

def iter_list(seq, key, indices):
    for k, v in enumerate(seq):
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])
# test
data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ],
        '2_data': []
    }
}

for t in find_key(data, '3_data'):
    print(t)
Output
(['1_data', '4_data', 1, '3_data'], 'hooray2')
To get a single key list you can pass the generator returned by find_key to the next function. And if you want to use a key list to fetch the associated value you can use a simple for loop.
seq, val = next(find_key(data, '3_data'))
print('seq:', seq, 'val:', val)
obj = data
for k in seq:
    obj = obj[k]
print('obj:', obj, obj == val)
Output
seq: ['1_data', '4_data', 1, '3_data'] val: hooray2
obj: hooray2 True
If the key may be missing, then give next an appropriate default tuple. For example:
seq, val = next(find_key(data, '6_data'), ([], None))
print('seq:', seq, 'val:', val)
if seq:
    obj = data
    for k in seq:
        obj = obj[k]
    print('obj:', obj, obj == val)
Output
seq: [] val: None
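The for loop that follows a key list can be wrapped in a small helper. The sketch below is an illustration only: the name get_by_path and its default parameter are assumptions, not part of the answer's code. It returns the default when any step of the path is missing.

```python
def get_by_path(obj, path, default=None):
    """Follow a list of dict keys / list indices into a nested object."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            # Missing key, out-of-range index, or a value that cannot
            # be subscripted with this step: give up and use the default.
            return default
    return obj


data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ],
        '2_data': []
    }
}

print(get_by_path(data, ['1_data', '4_data', 1, '3_data']))             # hooray2
print(get_by_path(data, ['1_data', '4_data', 5, '3_data'], 'missing'))  # missing
```

Catching the lookup errors in one place avoids the separate "if seq:" check, at the cost of not distinguishing a missing path from a stored value that equals the default.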
Note that this code is for Python 3. To run it on Python 2 you need to replace all the yield from statements, e.g. replace
yield from iter_dict(obj, key, [])
with
for u in iter_dict(obj, key, []):
    yield u
How it works
To understand how this code works you need to be familiar with recursion and with Python generators. You may also find this page helpful: Understanding Generators in Python; there are also various Python generator tutorials available online.
The Python object returned by json.load or json.loads is generally a dict, but it can also be a list. We pass that object to the find_key generator as the obj arg, along with the key string that we want to locate. find_key then calls either iter_dict or iter_list, as appropriate, passing it the object, the key, and an empty list, indices, which is used to collect the dict keys and list indices that lead to the key we want.
iter_dict iterates over each (k, v) pair at the top level of its d dict arg. If k matches the key we're looking for then the current indices list is yielded with k appended to it, along with the associated value. Because iter_dict is recursive, the yielded (indices list, value) pairs get passed up to the previous level of recursion, eventually making their way up to find_key and then to the code that called find_key. Note that this is the "base case" of our recursion: it's the part of the code that determines whether this recursion path leads to the key we want. If a recursion path never finds a key matching the one we're looking for, then that path yields nothing and simply terminates; the extended indices lists it built along the way are discarded.
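To see which recursion paths are explored and which of them actually yield, the search can be instrumented. The sketch below is an assumption-laden illustration, not the answer's code: find_key_traced is a hypothetical single-generator variant of find_key that also records the path of every dict or list it enters in a visited list.

```python
def find_key_traced(obj, key, indices=(), visited=None):
    """Hypothetical variant of find_key that logs every container it enters."""
    if isinstance(obj, dict):
        children = obj.items()
    elif isinstance(obj, list):
        children = enumerate(obj)
    else:
        return  # a scalar ends this recursion path without yielding
    if visited is not None:
        visited.append(list(indices))
    for k, v in children:
        # Only dict keys are compared with the search key, never list indices.
        if isinstance(obj, dict) and k == key:
            yield list(indices) + [k], v
        # Descend regardless of whether k matched.
        yield from find_key_traced(v, key, indices + (k,), visited)


data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ],
        '2_data': []
    }
}

visited = []
matches = list(find_key_traced(data, '3_data', visited=visited))
print(matches)  # [(['1_data', '4_data', 1, '3_data'], 'hooray2')]
for path in visited:
    print(path)
```

Six containers are entered (the top-level dict, '1_data', '4_data', its two dict elements, and the empty '2_data' list), but only the path through index 1 of '4_data' produces a result; the others end silently.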
If the current v is a dict, then we need to examine all the (key, value) pairs it contains. We do that by making a recursive call to iter_dict, passing v as its starting object, along with the key and the current indices list extended with k. If the current v is a list we instead call iter_list, passing it the same args. This descent happens whether or not k matched, so a matching key nested inside a matched value is found as well.
iter_list works similarly to iter_dict, except that a list doesn't have any keys, it only contains values. So we don't perform the k == key test; we just recurse into any dicts or lists that the list contains.
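As a quick check of the list handling, the generators can be run on JSON whose top level is a list that itself contains a nested list. The three functions are repeated from the answer so the snippet runs on its own; the JSON text is made up for illustration.

```python
import json


def find_key(obj, key):
    if isinstance(obj, dict):
        yield from iter_dict(obj, key, [])
    elif isinstance(obj, list):
        yield from iter_list(obj, key, [])


def iter_dict(d, key, indices):
    for k, v in d.items():
        if k == key:
            yield indices + [k], v
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])


def iter_list(seq, key, indices):
    for k, v in enumerate(seq):
        if isinstance(v, dict):
            yield from iter_dict(v, key, indices + [k])
        elif isinstance(v, list):
            yield from iter_list(v, key, indices + [k])


# Illustrative input: a top-level list holding a nested list and a dict.
text = '[[{"a": 1}], {"b": [{"a": 2}], "c": "a"}]'
obj = json.loads(text)

for path, value in find_key(obj, 'a'):
    print(path, value)
# [0, 0, 'a'] 1
# [1, 'b', 0, 'a'] 2
```

List indices appear in the paths as integers, and the string "a" stored under "c" is not reported because only dict keys are compared with the search key, not values.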
The end result of this process is that when we iterate over find_key we get (indices, value) pairs, where each indices list is the sequence of dict keys and list indices that successfully terminates in a dict item with our desired key, and value is the value associated with that particular key.
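Since the question asks for an expression such as data['1_data']['4_data'][1]['3_data'], an indices list can be rendered in that form. This is a minimal sketch; path_to_expr and its name parameter are hypothetical and not part of the answer's code.

```python
def path_to_expr(path, name='data'):
    """Render a list of dict keys / list indices as a subscript expression."""
    # repr() quotes string keys and leaves integer indices bare.
    return name + ''.join('[{!r}]'.format(step) for step in path)


print(path_to_expr(['1_data', '4_data', 1, '3_data']))
# data['1_data']['4_data'][1]['3_data']
```

The resulting string is meant for reading or pasting into code; to fetch the value at run time, looping over the indices list as shown earlier is simpler than evaluating the string.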
If you'd like to see some other examples of this code in use, please see "How to modify the key of a nested Json" and "How can I select deeply nested key:values from dictionary in python".
Also take a look at my new, more streamlined show_indices function.
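The show_indices code is not reproduced on this page, so the following is not that function. It is a hypothetical sketch (iter_leaf_paths is an assumed name) of a related idea: instead of searching for one key, list the path to every value, which gives a compact overview of a large structure.

```python
def iter_leaf_paths(obj, path=()):
    """Yield (path, value) for every leaf in nested dicts and lists."""
    if isinstance(obj, dict) and obj:
        items = obj.items()
    elif isinstance(obj, list) and obj:
        items = enumerate(obj)
    else:
        # Scalars and empty containers are reported as leaves.
        yield list(path), obj
        return
    for k, v in items:
        yield from iter_leaf_paths(v, path + (k,))


data = {
    '1_data': {
        '4_data': [
            {'5_data': 'hooray'},
            {'3_data': 'hooray2'}
        ],
        '2_data': []
    }
}

for path, value in iter_leaf_paths(data):
    print(path, repr(value))
# ['1_data', '4_data', 0, '5_data'] 'hooray'
# ['1_data', '4_data', 1, '3_data'] 'hooray2'
# ['1_data', '2_data'] []
```

Because it is a generator, the listing can be cut short on huge inputs, for example with itertools.islice.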