pandas DataFrame: normalize one JSON column and merge with other columns
Question
I have a pandas DataFrame in which one column holds multiple JSON data items as a list of dicts. I want to normalize the JSON column and repeat the non-JSON columns for each item:
import json
import pandas as pd

# creating dataframe
df_actions = pd.DataFrame(columns=['id', 'actions'])
rows = [[12, json.loads('[{"type": "a","value": "17"},{"type": "b","value": "19"}]')],
        [15, json.loads('[{"type": "a","value": "1"},{"type": "b","value": "3"},{"type": "c","value": "5"}]')]]
df_actions.loc[0] = rows[0]
df_actions.loc[1] = rows[1]
>>>df_actions
id actions
0 12 [{'type': 'a', 'value': '17'}, {'type': 'b', '...
1 15 [{'type': 'a', 'value': '1'}, {'type': 'b', 'v...
I want:
>>>df_actions_parsed
id type value
12 a 17
12 b 19
15 a 1
15 b 3
15 c 5
I can normalize the JSON data using:
# pd.json_normalize needs pandas >= 1.0; older versions import it from pandas.io.json
pd.concat([pd.json_normalize(x) for x in df_actions['actions']], ignore_index=True)
but I don't know how to join that back to the id column of the original DataFrame.
Recommended answer
You can use concat with a dict comprehension, using pop to extract the actions column. The dict keys (the original row labels) become the first index level, so drop the second level and join back to the original DataFrame:
df1 = (pd.concat({i: pd.DataFrame(x) for i, x in df_actions.pop('actions').items()})
.reset_index(level=1, drop=True)
.join(df_actions)
.reset_index(drop=True))
This is equivalent to:
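# pop() removes 'actions' from df_actions, so run this on a freshly built df_actions
# pd.json_normalize needs pandas >= 1.0; older versions import it from pandas.io.json
df1 = (pd.concat({i: pd.json_normalize(x) for i, x in df_actions.pop('actions').items()})
.reset_index(level=1, drop=True)
.join(df_actions)
.reset_index(drop=True))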
print (df1)
type value id
0 a 17 12
1 b 19 12
2 a 1 15
3 b 3 15
4 c 5 15
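As an alternative to the concat/join approach above, here is a minimal sketch that lets pd.json_normalize carry the id along itself through its record_path and meta parameters. It is not part of the original answer and assumes pandas >= 1.0. The sample frame is rebuilt from the question's data, and the final column reordering is only there to match the layout requested in the question.

```python
import json

import pandas as pd

# Rebuild the sample data from the question.
df_actions = pd.DataFrame({
    "id": [12, 15],
    "actions": [
        json.loads('[{"type": "a", "value": "17"}, {"type": "b", "value": "19"}]'),
        json.loads('[{"type": "a", "value": "1"}, {"type": "b", "value": "3"}, '
                   '{"type": "c", "value": "5"}]'),
    ],
})

# Turn each row into a plain dict: {"id": ..., "actions": [...]}.
records = df_actions.to_dict("records")

# record_path names the nested list to expand into rows;
# meta lists the top-level keys to repeat for every expanded row.
df_actions_parsed = pd.json_normalize(records, record_path="actions", meta=["id"])

# Put id first to match the layout asked for in the question.
df_actions_parsed = df_actions_parsed[["id", "type", "value"]]

print(df_actions_parsed)
```

Unlike pop, this leaves df_actions untouched. The id column produced through meta comes back with object dtype, so cast it with astype(int) if a numeric column is needed.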