Search for combinations in JSON nested object


Question

I have a large JSON object. Part of it looks like this:

data = [
{  
   'make': 'dacia',
   'model': 'x',
   'version': 'A',
   'typ': 'sedan',
   'infos': [
            {'id': 1, 'name': 'steering wheel problems'}, 
            {'id': 32, 'name': 'ABS errors'}
   ]
},
{  
   'make': 'nissan',
   'model': 'z',
   'version': 'B',
   'typ': 'coupe',
   'infos': [
         {'id': 3, 'name': 'throttle problems'},
         {'id': 56, 'name': 'broken handbreak'},
         {'id': 11, 'name': 'missing seatbelts'}
   ]
}
]

I built a list of every combination of infos that can appear in my JSON (one car may have only a single info, while another may have many):

import itertools

inf = list(set(i.get('name') for d in data for i in (d['infos'] if isinstance(d['infos'], list) else [d['infos']])))
inf_comb = [combo for n in range(1, len(inf) + 1) for combo in itertools.combinations(inf, n)]
infos_combo = [list(elem) for elem in inf_comb]

Now I need to iterate over the whole JSON data and count how many times each set in infos_combo occurs, so I wrote this code:

tab = []
s = 0
for x in infos_combo:
   s = sum([1 for k in data if (([i['name'] for i in (k['infos'] if isinstance(k['infos'], list) else [k['infos']])] == x))])
   if s!= 0:
     tab.append({'infos': x, 'sum': s})
print(tab)

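The problem is that tab contains only some of the elements I expect: many more combinations occur in my JSON object and should be counted, but I cannot get them. How can I fix this?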

Recommended answer

OK, first you need to extract all the actual infos from your JSON data, like this:

infos = [
    [i["name"] for i in d["infos"]] if isinstance(d["infos"], list) else d["infos"]
    for d in data
]

This gives you something like the following, which we will use later:

[['steering wheel problems', 'ABS errors'], ['throttle problems', 'broken handbreak', 'missing seatbelts']]
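
Note that the expression above passes a non-list `infos` value through unchanged, so a record with a single entry would not end up as a list of names. The sketch below is one way to handle both shapes. It assumes, as the question's code suggests, that a lone entry is a single dict with a `name` key; `info_names` and `sample` are hypothetical names used only for illustration.

```python
def info_names(infos):
    """Return the list of 'name' values for one record's 'infos' field.

    Assumption: 'infos' is either a list of dicts or a single dict,
    and every dict carries a 'name' key.
    """
    if isinstance(infos, dict):
        infos = [infos]  # wrap a lone entry so both shapes are handled alike
    return [item["name"] for item in infos]


# Illustrative records only; the second one holds a single dict, not a list.
sample = [
    {"infos": [{"id": 1, "name": "steering wheel problems"},
               {"id": 32, "name": "ABS errors"}]},
    {"infos": {"id": 3, "name": "throttle problems"}},
]
names = [info_names(d["infos"]) for d in sample]
print(names)
```

With this helper every element of the result is a list of strings, which is what the later flattening and counting steps expect.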

Now, to build all the combinations, we first need to flatten the infos array and remove duplicates:

unique_infos = list(dict.fromkeys(x for l in infos for x in l))

To get all the combinations:

infos_combo = itertools.chain.from_iterable(
    itertools.combinations(unique_infos, r) for r in range(len(unique_infos) + 1)
)

This yields the following:

()
('steering wheel problems',)
('ABS errors',)
('throttle problems',)
('broken handbreak',)
('missing seatbelts',)
('steering wheel problems', 'ABS errors')
('steering wheel problems', 'throttle problems')
('steering wheel problems', 'broken handbreak')
...
# truncated code too long
...
('steering wheel problems', 'throttle problems', 'broken handbreak', 'missing seatbelts')
('ABS errors', 'throttle problems', 'broken handbreak', 'missing seatbelts')
('steering wheel problems', 'ABS errors', 'throttle problems', 'broken handbreak', 'missing seatbelts')
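
Two properties of this construction are easy to miss: the chain is a one-shot iterator, and it produces 2**n tuples for n unique infos (including the empty tuple), so it grows quickly on real data. A minimal sketch, using stand-in values rather than the original data and a hypothetical helper `all_combinations`:

```python
import itertools

items = ["a", "b", "c"]  # stand-in values, not taken from the original data


def all_combinations(values):
    """Lazily yield every combination of `values`, from the empty tuple up to all of them."""
    return itertools.chain.from_iterable(
        itertools.combinations(values, r) for r in range(len(values) + 1)
    )


combos = all_combinations(items)
first_pass = list(combos)   # consumes the iterator
second_pass = list(combos)  # nothing left: the chain is already exhausted

print(len(first_pass))   # 8, i.e. 2 ** 3
print(len(second_pass))  # 0
```

If the combinations need to be printed and then counted, materialise them once with `list(...)` or rebuild the iterator before the second use.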

After that, count how many times each combination occurs in the original infos list:

occurences = {}
for combo in infos_combo:
    occurences[combo] = infos.count(list(combo))

print(occurences)
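
`list.count` compares lists element by element, so a record whose names appear in a different order than the generated combination is not counted. One possible alternative, sketched here and not part of the original answer, is to count the records directly with `collections.Counter` keyed by `frozenset`. This ignores order, skips generating every combination, and only reports sets that actually occur. `count_info_sets` is a hypothetical helper, and the sample list is illustrative.

```python
from collections import Counter


def count_info_sets(infos):
    """Count how often each set of names occurs, ignoring order within a record."""
    return Counter(frozenset(names) for names in infos)


# Illustrative input: the first two records hold the same names in a different order.
infos = [
    ["steering wheel problems", "ABS errors"],
    ["ABS errors", "steering wheel problems"],
    ["throttle problems"],
]

counts = count_info_sets(infos)

# Same shape as the `tab` list in the question; names are sorted for stable output.
tab = [{"infos": sorted(key), "sum": n} for key, n in counts.items()]
print(tab)
```

Like the loop above, this counts records whose complete set of names equals the key, not records that merely contain it. A `frozenset` also collapses repeated names within one record, which may or may not be wanted.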

Full code:

import itertools

data = [
    {
        "make": "dacia",
        "model": "x",
        "version": "A",
        "typ": "sedan",
        "infos": [
            {"id": 1, "name": "steering wheel problems"},
            {"id": 32, "name": "ABS errors"},
        ],
    },
    {
        "make": "nissan",
        "model": "z",
        "version": "B",
        "typ": "coupe",
        "infos": [
            {"id": 3, "name": "throttle problems"},
            {"id": 56, "name": "broken handbreak"},
            {"id": 11, "name": "missing seatbelts"},
        ],
    },
]

infos = [
    [i["name"] for i in d["infos"]] if isinstance(d["infos"], list) else d["infos"]
    for d in data
]

# dict.fromkeys drops duplicates while keeping first-seen order
unique_infos = list(dict.fromkeys(x for l in infos for x in l))

infos_combo = itertools.chain.from_iterable(
    itertools.combinations(unique_infos, r) for r in range(len(unique_infos) + 1)
)

occurences = {}
for combo in infos_combo:
    occurences[combo] = infos.count(list(combo))

print(occurences)
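
The full code counts a combination only when it equals a record's entire infos list. If "occurs" is instead meant as "appears within a record's infos", which is an assumption about the question's intent, the counting could be done per record as sketched below. `count_sub_combinations` is a hypothetical helper and the input values are stand-ins.

```python
import itertools
from collections import Counter


def count_sub_combinations(infos):
    """Count, for every non-empty combination of names, the records that contain it."""
    counts = Counter()
    for names in infos:
        unique = sorted(set(names))  # canonical order so equal sets share one key
        for r in range(1, len(unique) + 1):
            counts.update(itertools.combinations(unique, r))
    return counts


infos = [["a", "b"], ["b", "a", "c"]]  # stand-in values
counts = count_sub_combinations(infos)

print(counts[("a", "b")])  # 2: both records contain "a" and "b"
print(counts[("c",)])      # 1: only the second record contains "c"
```

This still enumerates 2**k - 1 combinations for a record with k infos, but k is the size of a single record rather than the number of unique infos across the whole data set.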
