DataFrame到嵌套的JSON [英] DataFrame to nested JSON

查看:90
本文介绍了DataFrame到嵌套的JSON的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是JSON格式文件的新手.

我有一个Pandas DataFrame:

I have a Pandas DataFrame:

import pandas as pd

df = pd.DataFrame([["A", "2014/01/01", "2014/01/02", "A", -0.0061, "A"],
                   ["A", "2015/07/11", "2015/08/21", "A", 1.50, "A"],
                   ["C", "2016/01/01", "2016/01/05", "U", 2.75, "R"],
                   ["D", "2013/05/19", "2014/09/30", "Q", -100.0, "N"],
                   ["B", "2015/08/22", "2015/09/01", "T", 10.0, "R"]],
                   columns=["P", "Start", "End", "Category", "Value", "Group"]
                 ) 

看起来像这样

   P       Start         End Category     Value Group
0  A  2014/01/01  2014/01/02        A   -0.0061     A
1  A  2015/07/11  2015/08/21        A    1.5000     A
2  C  2016/01/01  2016/01/05        U    2.7500     R
3  D  2013/05/19  2014/09/30        Q -100.0000     N
4  B  2015/08/22  2015/09/01        T   10.0000     R

我知道我可以通过以下方式将其转换为JSON

I know that I could convert this to JSON via:

df.to_json("output.json")

但是我需要将其转换为这样的嵌套JSON格式:

But I need to convert it to a nested JSON format like this:

{
  "group_list": [
    {
      "category_list": [
        {
          "category": "A",
          "p_list": [
            {
              "p": "A",
              "date_list": [
                {
                  "start": "2014/01/01",
                  "end": "2014/01/02",
                  "value": "-0.0061"
                }
              ]
            },
            {
              "p": "A",
              "date_list": [
                {
                  "start": "2015/07/11",
                  "end": "2015/08/21",
                  "value": "1.5000"
                }
              ]
            }
          ]
        }
      ],
      "group": "A"
    },
    {
      "category_list": [
        {
          "category": "U",
          "p_list": [
            {
              "p": "C",
              "date_list": [
                {
                  "start": "2016/01/01",
                  "end": "2016/01/05",
                  "value": "2.7500"
                }
              ]
            }
          ]
        },
        {
          "category": "T",
          "p_list": [
            {
              "p": "B",
              "date_list": [
                {
                  "start": "2015/08/22",
                  "end": "2015/09/01",
                  "value": "10.000"
                }
              ]
            }
          ]
        }
      ],
      "group": "R"
    },
    {
      "category_list": [
        {
          "category": "Q",
          "p_list": [
            {
              "p": "D",
              "date_list": [
                {
                  "start": "2013/05/19",
                  "end": "2014/09/30",
                  "value": "-100.0000"
                }
              ]
            }
          ]
        }
      ],
      "group": "N"
    }
  ]
}

我已经考虑过使用Pandas的groupby功能,但是我还不太清楚如何将其转换为最终的JSON格式.本质上,嵌套从将具有相同组"和类别"列的行分组在一起开始.之后,只需列出行即可.我可以用嵌套的for循环编写一些代码,但我希望有一种更有效的方法来完成此任务.

I've considered using Pandas' groupby functionality but I can't quite figure out how I could then get it into the final JSON format. Essentially, the nesting begins with grouping together rows with the same "group" and "category" columns. Afterwards, it is a matter of listing out the rows. I could write some code with nested for-loops but I'm hoping that there is a more efficient way to accomplish this.

更新

我还可以通过以下方式操纵我的DataFrame:

I can also manipulate my DataFrame via:

df2 = df.set_index(['Group', 'Category', 'P']).stack()

Group  Category  P       
    A      A         A  Start    2014/01/01
                        End      2014/01/02
                        Value       -0.0061
                        Start    2015/07/11
                        End      2015/08/21
                        Value           1.5
    R      U         C  Start    2016/01/01
                        End      2016/01/05
                        Value          2.75
    N      Q         D  Start    2013/05/19
                        End      2014/09/30
                        Value          -100
    R      T         B  Start    2015/08/22
                        End      2015/09/01
                        Value            10

接近我需要的位置,但我认为在这种情况下不能打电话给df2.to_json().

which is close to where I need to be but I don't think one could call df2.to_json() in this case.

推荐答案

下面的嵌套循环应该可以使您更加接近:

The below nested loop should get you pretty close:

import json
from json import dumps

json_dict = {}
json_dict['group_list'] = []
for grp, grp_data in df.groupby('Group'):
    grp_dict = {}
    grp_dict['group'] = grp
    for cat, cat_data in grp_data.groupby('Category'):
        grp_dict['category_list'] = []
        cat_dict = {}
        cat_dict['category'] = cat
        cat_dict['p_list'] = []
        for p, p_data in cat_data.groupby('P'):
            p_data = p_data.drop(['Category', 'Group'], axis=1).set_index('P')
            for d in p_data.to_dict(orient='records'):
                cat_dict['p_list'].append({'p': p, 'date_list': [d]})
        grp_dict['category_list'].append(cat_dict)
    json_dict['group_list'].append(grp_dict)
json_out = dumps(json_dict)
parsed = json.loads(json_out)

导致:

json.dumps(parsed, indent=4, sort_keys=True)

{
    "group_list": [
        {
            "category_list": [
                {
                    "category": "A",
                    "p_list": [
                        {
                            "date_list": [
                                {
                                    "End": "2014/01/02",
                                    "Start": "2014/01/01",
                                    "Value": -0.0061
                                }
                            ],
                            "p": "A"
                        },
                        {
                            "date_list": [
                                {
                                    "End": "2015/08/21",
                                    "Start": "2015/07/11",
                                    "Value": 1.5
                                }
                            ],
                            "p": "A"
                        }
                    ]
                }
            ],
            "group": "A"
        },
        {
            "category_list": [
                {
                    "category": "Q",
                    "p_list": [
                        {
                            "date_list": [
                                {
                                    "End": "2014/09/30",
                                    "Start": "2013/05/19",
                                    "Value": -100.0
                                }
                            ],
                            "p": "D"
                        }
                    ]
                }
            ],
            "group": "N"
        },
        {
            "category_list": [
                {
                    "category": "U",
                    "p_list": [
                        {
                            "date_list": [
                                {
                                    "End": "2016/01/05",
                                    "Start": "2016/01/01",
                                    "Value": 2.75
                                }
                            ],
                            "p": "C"
                        }
                    ]
                }
            ],
            "group": "R"
        }
    ]
}

这篇关于DataFrame到嵌套的JSON的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆