Extract set of leaf values found in nested dicts and lists excluding None
Question
I have a nested structure read from YAML which is composed of nested lists and/or nested dicts or a mix of both at various levels of nesting. It can be assumed that the structure doesn't contain any recursive objects.
How do I extract only the leaf values from it? I also don't want any None values. The leaf values are strings, which are all I care about. Recursion is acceptable, since the maximum depth of the structure is not large enough to exceed the recursion limit. A generator would also be fine.
Similar questions exist that deal with flattening lists or dicts, but not a mix of both. Those that flatten a dict also return the flattened keys, which I don't need and which risk name conflicts.
I tried more_itertools.collapse, but its examples only show it working with nested lists, not with a mix of dicts and lists.
struct1 = {
    "k0": None,
    "k1": "v1",
    "k2": ["v0", None, "v1"],
    "k3": ["v0", ["v1", "v2", None, ["v3"], ["v4", "v5"], []]],
    "k4": {"k0": None},
    "k5": {"k1": {"k2": {"k3": "v3", "k4": "v6"}, "k4": {}}},
    "k6": [{}, {"k1": "v7"}, {"k2": "v8", "k3": "v9", "k4": {"k5": {"k6": "v10"}, "k7": {}}}],
    "k7": {
        "k0": [],
        "k1": ["v11"],
        "k2": ["v12", "v13"],
        "k3": ["v14", ["v15"]],
        "k4": [["v16"], ["v17"]],
        "k5": ["v18", ["v19", "v20", ["v21", "v22", []]]],
    },
}

struct2 = ["aa", "bb", "cc", ["dd", "ee", ["ff", "gg"], None, []]]
Expected output
struct1_leaves = {f"v{i}" for i in range(23)}
struct2_leaves = {f"{s}{s}" for s in "abcdefg"}
Recommended answer
This is an adaptation of the reference answer that uses an inner function with a single set. It also uses recursion, and it produces the expected outputs for the sample inputs included in the question. Because the inner function adds each leaf directly to the shared set, it avoids passing every leaf back up through the entire call stack.
from typing import Any, Set


def leaves(struct: Any) -> Set[Any]:
    """Return a set of leaf values found in nested dicts and lists excluding None values."""
    # Ref: https://stackoverflow.com/a/59832594/
    values = set()

    def add_leaves(struct_: Any) -> None:
        if isinstance(struct_, dict):
            for sub_struct in struct_.values():
                add_leaves(sub_struct)
        elif isinstance(struct_, list):
            for sub_struct in struct_:
                add_leaves(sub_struct)
        elif struct_ is not None:
            values.add(struct_)

    add_leaves(struct)
    return values
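
Since the question mentions that a generator would also be acceptable, here is a minimal sketch of a generator variant. The name iter_leaves is hypothetical and not part of the original answer. Unlike the set-based version above, each leaf is handed back through every level of yield from, so this trades some efficiency for laziness and preserved order.

```python
from typing import Any, Iterator


def iter_leaves(struct: Any) -> Iterator[Any]:
    """Yield each non-None leaf found in nested dicts and lists."""
    if isinstance(struct, dict):
        # Only the values of a dict are traversed; keys are ignored.
        for sub_struct in struct.values():
            yield from iter_leaves(sub_struct)
    elif isinstance(struct, list):
        for sub_struct in struct:
            yield from iter_leaves(sub_struct)
    elif struct is not None:
        yield struct


struct2 = ["aa", "bb", "cc", ["dd", "ee", ["ff", "gg"], None, []]]
print(sorted(set(iter_leaves(struct2))))
# ['aa', 'bb', 'cc', 'dd', 'ee', 'ff', 'gg']
```

Wrapping the call in set(...) gives the same result as leaves(...), while iterating directly keeps duplicates and encounter order.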
Credit: Stack Overflow