How to flatten a nested JSON recursively, with flatten_json
Problem description
- The package is on PyPI as flatten-json 0.1.7 and can be installed with
    pip install flatten-json
- This question is specific to the following component of the package:
def flatten_json(nested_json: dict, exclude: list=[''], sep: str='_') -> dict:
    """
    Flatten a list of nested dicts.
    """
    out = dict()

    def flatten(x: (list, dict, str), name: str='', exclude=exclude):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f'{name}{a}{sep}')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, f'{name}{i}{sep}')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out
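To make the parameters concrete, here is a minimal sketch that calls a condensed copy of the function above on a small made-up dict. The `sample` data and the variable names are assumptions for illustration only; the behavior follows from the code shown, not from any further package documentation.

```python
def flatten_json(nested_json, exclude=('',), sep='_'):
    """Condensed copy of the function above, so this example runs on its own."""
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f'{name}{a}{sep}')
        elif type(x) is list:
            for i, a in enumerate(x):
                flatten(a, f'{name}{i}{sep}')
        else:
            # Strips exactly one trailing character from the accumulated name.
            out[name[:-1]] = x

    flatten(nested_json)
    return out


# Hypothetical sample data, not taken from the question.
sample = {'id': 1, 'meta': {'tag': 'x', 'secret': 'hide'}, 'items': [{'n': 1}, {'n': 2}]}

flat = flatten_json(sample)                               # default separator
flat_excluded = flatten_json(sample, exclude=['secret'])  # skip a key at any depth
flat_dotted = flatten_json(sample, sep='.')               # one-character separator

# A two-character separator leaves one character behind on every key.
flat_double = flatten_json({'a': {'b': 1}}, sep='__')
```

Because the final key is built with `name[:-1]`, only a single-character `sep` is stripped cleanly; with `sep='__'` the key comes out as `a__b_`. Note also that `exclude` is checked against dict keys at every nesting level, not just the top one.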
Flatten nested dicts using recursion
- Thinking Recursively in Python
- Flattening JSON objects in Python
- flatten_json has been used to unpack a file that ended up being over 100000 columns.
- This question doesn't cover unflattening. However, if you install the flatten package, there is an unflatten method, but I haven't tested it.
- This answer focuses on using flatten_json to recursively flatten a nested dict or JSON.
- This answer assumes you already have the JSON or dict loaded into some variable (e.g. from a file, an API, etc.).
  - In this case we will use data.
- It accepts a dict, as shown by the function type hint.
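- Just a dict: {}
  - flatten_json(data)
- A list of dicts: [{}, {}, {}]
  - [flatten_json(x) for x in data]
- A dict whose top-level keys each hold a dict: {'1': {}, '2': {}, '3': {}}
  - [flatten_json(data[key]) for key in data.keys()]
- A dict with a key holding a list of dicts: {'key': [{}, {}, {}]}
  - [flatten_json(x) for x in data['key']]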
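The four call patterns above can be sketched side by side. This is a minimal illustration using a condensed copy of flatten_json and tiny made-up inputs (`single`, `records`, `keyed`, `wrapped` are hypothetical names); it also shows why a list is flattened element by element rather than passed in whole.

```python
def flatten_json(nested_json, exclude=('',), sep='_'):
    """Condensed copy of the function from the question, for a runnable demo."""
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f'{name}{a}{sep}')
        elif type(x) is list:
            for i, a in enumerate(x):
                flatten(a, f'{name}{i}{sep}')
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out


# Hypothetical inputs, one per shape.
single = {'a': {'b': 1}}
records = [{'a': {'b': 1}}, {'a': {'b': 2}}]
keyed = {'1': {'a': {'b': 1}}, '2': {'a': {'b': 2}}}
wrapped = {'key': [{'a': {'b': 1}}, {'a': {'b': 2}}]}

row = flatten_json(single)                                        # one flat dict
rows_from_list = [flatten_json(x) for x in records]               # one flat dict per element
rows_from_keyed = [flatten_json(keyed[key]) for key in keyed.keys()]
rows_from_wrapped = [flatten_json(x) for x in wrapped['key']]

# Passing the whole list instead yields a single wide dict with index prefixes.
one_wide_row = flatten_json(records)
```

Calling `flatten_json(records)` directly gives `{'0_a_b': 1, '1_a_b': 2}`, i.e. one wide record, which is usually not what you want when each element should become its own row.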
- I typically flatten data into a pandas.DataFrame for further analysis.
  - Load pandas with import pandas as pd
Data 1:
{
    "id": 1,
    "class": "c1",
    "owner": "myself",
    "metadata": {
        "m1": {"value": "m1_1", "timestamp": "d1"},
        "m2": {"value": "m1_2", "timestamp": "d2"},
        "m3": {"value": "m1_3", "timestamp": "d3"},
        "m4": {"value": "m1_4", "timestamp": "d4"}
    },
    "a1": {"a11": []},
    "m1": {},
    "comm1": "COMM1",
    "comm2": "COMM21529089656387",
    "share": "xxx",
    "share1": "yyy",
    "hub1": "h1",
    "hub2": "h2",
    "context": []
}
Flatten 1:
df = pd.DataFrame([flatten_json(data)])

| id | class | owner | metadata_m1_value | metadata_m1_timestamp | metadata_m2_value | metadata_m2_timestamp | metadata_m3_value | metadata_m3_timestamp | metadata_m4_value | metadata_m4_timestamp | comm1 | comm2 | share | share1 | hub1 | hub2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | c1 | myself | m1_1 | d1 | m1_2 | d2 | m1_3 | d3 | m1_4 | d4 | COMM1 | COMM21529089656387 | xxx | yyy | h1 | h2 |
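Notice that "a1", "m1" and "context" from Data 1 have no columns in the result. That follows from the function itself: a value is only written in the final else branch, and an empty dict or list never reaches it. If you need those keys to survive, one possible approach is a variant along the following lines. `flatten_keep_empty` is a hypothetical helper written for this illustration, not part of the flatten-json package, and it assumes a non-empty separator.

```python
def flatten_keep_empty(nested, sep='_'):
    """Hypothetical variant of flatten_json that records empty containers as None."""
    out = {}

    def flatten(x, name=''):
        if isinstance(x, dict) and x:
            for key, value in x.items():
                flatten(value, f'{name}{key}{sep}')
        elif isinstance(x, list) and x:
            for i, value in enumerate(x):
                flatten(value, f'{name}{i}{sep}')
        else:
            # Reached by scalars and by empty dicts/lists; the latter become None.
            out[name[:-len(sep)]] = None if isinstance(x, (dict, list)) else x

    flatten(nested)
    return out


# A trimmed-down version of Data 1 that keeps only the interesting keys.
data = {'id': 1, 'a1': {'a11': []}, 'm1': {}, 'context': []}
flat = flatten_keep_empty(data)
```

Whether None is the right placeholder depends on what the downstream analysis expects; it is used here only because it is unambiguous.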
Data 2:
[
    {
        'accuracy': 17,
        'activity': [
            {
                'activity': [{'confidence': 100, 'type': 'STILL'}],
                'timestampMs': '1542652'
            }
        ],
        'altitude': -10,
        'latitudeE7': 3777321,
        'longitudeE7': -122423125,
        'timestampMs': '1542654',
        'verticalAccuracy': 2
    },
    {
        'accuracy': 17,
        'activity': [
            {
                'activity': [{'confidence': 100, 'type': 'STILL'}],
                'timestampMs': '1542652'
            }
        ],
        'altitude': -10,
        'latitudeE7': 3777321,
        'longitudeE7': -122423125,
        'timestampMs': '1542654',
        'verticalAccuracy': 2
    },
    {
        'accuracy': 17,
        'activity': [
            {
                'activity': [{'confidence': 100, 'type': 'STILL'}],
                'timestampMs': '1542652'
            }
        ],
        'altitude': -10,
        'latitudeE7': 3777321,
        'longitudeE7': -122423125,
        'timestampMs': '1542654',
        'verticalAccuracy': 2
    }
]
Flatten 2:
df = pd.DataFrame([flatten_json(x) for x in data])

| accuracy | activity_0_activity_0_confidence | activity_0_activity_0_type | activity_0_timestampMs | altitude | latitudeE7 | longitudeE7 | timestampMs | verticalAccuracy |
|---|---|---|---|---|---|---|---|---|
| 17 | 100 | STILL | 1542652 | -10 | 3777321 | -122423125 | 1542654 | 2 |
| 17 | 100 | STILL | 1542652 | -10 | 3777321 | -122423125 | 1542654 | 2 |
| 17 | 100 | STILL | 1542652 | -10 | 3777321 | -122423125 | 1542654 | 2 |
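In Data 2 every record has the same shape, so every row has the same columns. When the nested lists differ in length, the flattened dicts end up with different key sets, because the list index is part of each key. The sketch below uses a simplified, made-up pair of records to show the effect and builds the union of columns in plain Python; pandas.DataFrame does a comparable alignment for a list of dicts, filling the missing cells with NaN.

```python
def flatten_json(nested_json, exclude=('',), sep='_'):
    """Condensed copy of the function from the question, for a runnable demo."""
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f'{name}{a}{sep}')
        elif type(x) is list:
            for i, a in enumerate(x):
                flatten(a, f'{name}{i}{sep}')
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out


# Hypothetical records: the second one has two activities instead of one.
records = [
    {'accuracy': 17, 'activity': [{'type': 'STILL'}]},
    {'accuracy': 20, 'activity': [{'type': 'WALKING'}, {'type': 'STILL'}]},
]
rows = [flatten_json(r) for r in records]

# Union of all keys, in first-seen order (dicts preserve insertion order).
columns = list(dict.fromkeys(key for row in rows for key in row))

# Align every row to the full column list; missing cells become None.
table = [[row.get(col) for col in columns] for row in rows]
```

This is one reason a flattened file can grow very wide: every extra list element in any record adds new columns for all records.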
Data 3:
{
    "1": {
        "VENUE": "JOEBURG",
        "COUNTRY": "HAE",
        "ITW": "XAD",
        "RACES": {
            "1": {"NO": 1, "TIME": "12:35"},
            "2": {"NO": 2, "TIME": "13:10"},
            "3": {"NO": 3, "TIME": "13:40"},
            "4": {"NO": 4, "TIME": "14:10"},
            "5": {"NO": 5, "TIME": "14:55"},
            "6": {"NO": 6, "TIME": "15:30"},
            "7": {"NO": 7, "TIME": "16:05"},
            "8": {"NO": 8, "TIME": "16:40"}
        }
    },
    "2": {
        "VENUE": "FOOBURG",
        "COUNTRY": "ABA",
        "ITW": "XAD",
        "RACES": {
            "1": {"NO": 1, "TIME": "12:35"},
            "2": {"NO": 2, "TIME": "13:10"},
            "3": {"NO": 3, "TIME": "13:40"},
            "4": {"NO": 4, "TIME": "14:10"},
            "5": {"NO": 5, "TIME": "14:55"},
            "6": {"NO": 6, "TIME": "15:30"},
            "7": {"NO": 7, "TIME": "16:05"},
            "8": {"NO": 8, "TIME": "16:40"}
        }
    }
}
Flatten 3:
df = pd.DataFrame([flatten_json(data[key]) for key in data.keys()])

| VENUE | COUNTRY | ITW | RACES_1_NO | RACES_1_TIME | RACES_2_NO | RACES_2_TIME | RACES_3_NO | RACES_3_TIME | RACES_4_NO | RACES_4_TIME | RACES_5_NO | RACES_5_TIME | RACES_6_NO | RACES_6_TIME | RACES_7_NO | RACES_7_TIME | RACES_8_NO | RACES_8_TIME |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| JOEBURG | HAE | XAD | 1 | 12:35 | 2 | 13:10 | 3 | 13:40 | 4 | 14:10 | 5 | 14:55 | 6 | 15:30 | 7 | 16:05 | 8 | 16:40 |
| FOOBURG | ABA | XAD | 1 | 12:35 | 2 | 13:10 | 3 | 13:40 | 4 | 14:10 | 5 | 14:55 | 6 | 15:30 | 7 | 16:05 | 8 | 16:40 |
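One thing Flatten 3 discards is the top-level key itself ("1", "2"), since only the value under each key is flattened. If that key carries meaning, a small tweak keeps it as its own field. This is a sketch on a shortened, made-up version of Data 3, and the field name 'key' is an arbitrary choice for illustration.

```python
def flatten_json(nested_json, exclude=('',), sep='_'):
    """Condensed copy of the function from the question, for a runnable demo."""
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                if a not in exclude:
                    flatten(x[a], f'{name}{a}{sep}')
        elif type(x) is list:
            for i, a in enumerate(x):
                flatten(a, f'{name}{i}{sep}')
        else:
            out[name[:-1]] = x

    flatten(nested_json)
    return out


# Shortened stand-in for Data 3, with a single race per venue.
data = {
    '1': {'VENUE': 'JOEBURG', 'RACES': {'1': {'NO': 1, 'TIME': '12:35'}}},
    '2': {'VENUE': 'FOOBURG', 'RACES': {'1': {'NO': 1, 'TIME': '12:35'}}},
}

# Put the top-level key first, then unpack the flattened value after it.
rows = [{'key': key, **flatten_json(value)} for key, value in data.items()]
```

The resulting list of dicts can be handed to pd.DataFrame the same way as in Flatten 3. If a record already contains a field called 'key', the unpacked value would overwrite the one added here, so pick a name that cannot collide.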
Other examples:
Recommended answer
How to flatten a JSON or dict is a common question, to which there are many answers.