Pandas: parse JSON in a column and expand it to new rows in the dataframe


Question

I have a dataframe containing (record-formatted) JSON strings, as follows:

In [9]: df = pd.DataFrame({'col1': ['A','B'], 'col2': ['[{"t":"05:15","v":"20.0"}, {"t":"05:20","v":"25.0"}]',
                                                        '[{"t":"05:15","v":"10.0"}, {"t":"05:20","v":"15.0"}]']})
        df

Out[9]: 
  col1                                               col2
0    A  [{"t":"05:15","v":"20.0"}, {"t":"05:20","v":"2...
1    B  [{"t":"05:15","v":"10.0"}, {"t":"05:20","v":"1...

I would like to parse the JSON and add one new row to the dataframe for each record:

    col1  t         v
0   A     05:15:00  20
1   A     05:20:00  25
2   B     05:15:00  10
3   B     05:20:00  15

I've been experimenting with the following code:

def json_to_df(x):
    df2 = pd.read_json(x.col2)
    return df2

df.apply(json_to_df, axis=1)

But the resulting dataframes are assigned as tuples, rather than creating new rows. Any advice?

Recommended answer

OK, taking a little inspiration from hellpanderrr's answer above, I came up with the following:

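In [92]:
df = pd.DataFrame({'X': ['A','B'], 'Y': ['fdsfds','fdsfds'], 'json': ['[{"t":"05:15","v":"20.0"}, {"t":"05:20","v":"25.0"}]',
                                                                       '[{"t":"05:15","v":"10.0"}, {"t":"05:20","v":"15.0"}]']})
df
Out[92]:
    X   Y       json
0   A   fdsfds  [{"t":"05:15","v":"20.0"}, {"t":"05:20","v":"2...
1   B   fdsfds  [{"t":"05:15","v":"10.0"}, {"t":"05:20","v":"1...

In [93]:
from io import StringIO

dfs = []
def json_to_df(row, json_col):
    # Parse this row's JSON string into its own small dataframe.
    # Recent pandas versions deprecate passing a literal JSON string to
    # read_json, so the string is wrapped in StringIO.
    json_df = pd.read_json(StringIO(row[json_col]))
    # Copy the row's remaining columns onto every parsed record.
    dfs.append(json_df.assign(**row.drop(json_col)))

# apply() is used only for its side effect of filling dfs.
df.apply(json_to_df, axis=1, json_col='json')
pd.concat(dfs)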

Out[93]:
    t       v   X   Y
0   05:15   20  A   fdsfds
1   05:20   25  A   fdsfds
0   05:15   10  B   fdsfds
1   05:20   15  B   fdsfds
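
As a possible alternative to collecting frames in a global list, the same expansion can be sketched with `json.loads`, `DataFrame.explode` and `pd.json_normalize`. This is not the approach from the answer above, only a minimal sketch under some assumptions: the helper name `expand_json_column` is made up for illustration, every cell is assumed to hold a non-empty JSON array of flat objects, and `explode(..., ignore_index=True)` needs pandas 1.1 or later.

```python
import json

import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    'col1': ['A', 'B'],
    'col2': ['[{"t":"05:15","v":"20.0"}, {"t":"05:20","v":"25.0"}]',
             '[{"t":"05:15","v":"10.0"}, {"t":"05:20","v":"15.0"}]'],
})


def expand_json_column(frame, json_col):
    """Hypothetical helper: return one row per JSON record in json_col."""
    out = frame.copy()
    # Turn each JSON string into a Python list of dicts.
    out[json_col] = out[json_col].apply(json.loads)
    # Repeat each row once per list element and renumber the index.
    out = out.explode(json_col, ignore_index=True)
    # Spread the dict keys ("t", "v") into their own columns.
    records = pd.json_normalize(out[json_col].tolist())
    # Both frames share the same 0..n-1 index, so they line up row by row.
    return pd.concat([out.drop(columns=json_col), records], axis=1)


result = expand_json_column(df, 'col2')
# The values in "v" are strings in the source JSON; convert them to numbers.
result['v'] = pd.to_numeric(result['v'])
print(result)
```

Unlike `pd.concat(dfs)` above, which keeps the per-record index (0, 1, 0, 1), this version numbers the rows 0 to 3. The `t` column is left as plain strings such as `05:15`.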
