pandas groupby to nested JSON


Question

I often use pandas groupby to generate stacked tables, and I then often want to output the resulting nested relations to JSON. Is there any way to extract a nested JSON file from the stacked table it produces?

Let's say I have a df like:

year office candidate  amount
2010 mayor  joe smith  100.00
2010 mayor  jay gould   12.00
2010 govnr  pati mara  500.00
2010 govnr  jess rapp   50.00
2010 govnr  jess rapp   30.00
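
To reproduce this, the table above could be built roughly as follows. This is a minimal sketch: the column dtypes (integer year, float amount) are assumed from how the values are printed, not stated in the question.

```python
import pandas as pd

# Sample rows copied from the table above.
# Assumption: year is an integer column and amount is a float column.
df = pd.DataFrame(
    {
        "year": [2010, 2010, 2010, 2010, 2010],
        "office": ["mayor", "mayor", "govnr", "govnr", "govnr"],
        "candidate": [
            "joe smith",
            "jay gould",
            "pati mara",
            "jess rapp",
            "jess rapp",
        ],
        "amount": [100.00, 12.00, 500.00, 50.00, 30.00],
    }
)
```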

I can do:

grouped = df.groupby(['year', 'office', 'candidate']).sum()

print(grouped)
                       amount
year office candidate 
2010 mayor  joe smith   100
            jay gould    12
     govnr  pati mara   500
            jess rapp    80

Beautiful! Of course, what I'd really like to do is get nested JSON via a command along the lines of grouped.to_json. But that feature isn't available. Any workarounds?

So, what I really want is something like:

{"2010": {"mayor": [
                    {"joe smith": 100},
                    {"jay gould": 12}
                   ],
          "govnr": [
                    {"pati mara": 500},
                    {"jess rapp": 80}
                   ]
         }
}

Recommended answer

I don't think there is anything built into pandas to create a nested dictionary of the data. Below is some code that should work in general for a series with a MultiIndex, using a defaultdict.

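The nesting code iterates through each level of the MultiIndex, adding layers to the dictionary until the deepest layer is assigned to the Series value. As written, defaultdict(lambda: defaultdict(dict)) supplies enough layers for an index of up to three levels; a deeper index would need more nested defaultdicts.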

In  [99]: import json
     ...: from collections import defaultdict

In [100]: results = defaultdict(lambda: defaultdict(dict))

In [101]: for index, value in grouped.itertuples():
     ...:     for i, key in enumerate(index):
     ...:         if i == 0:
     ...:             nested = results[key]
     ...:         elif i == len(index) - 1:
     ...:             nested[key] = value
     ...:         else:
     ...:             nested = nested[key]

In [102]: results
Out[102]: defaultdict(<function <lambda> at 0x7ff17c76d1b8>, {2010: defaultdict(<type 'dict'>, {'govnr': {'pati mara': 500.0, 'jess rapp': 80.0}, 'mayor': {'joe smith': 100.0, 'jay gould': 12.0}})})

In [106]: print(json.dumps(results, indent=4))
{
    "2010": {
        "govnr": {
            "pati mara": 500.0, 
            "jess rapp": 80.0
        }, 
        "mayor": {
            "joe smith": 100.0, 
            "jay gould": 12.0
        }
    }
}
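
The same idea can be written without fixing the number of levels in advance. The sketch below is not part of the original answer: the helper names nest and leaves_to_lists are hypothetical, and the pairs list is a hand-written stand-in for the (index tuple, value) pairs of the grouped result, so the block runs without pandas. It assumes every key tuple has the same length, as in a MultiIndex.

```python
import json


def nest(pairs):
    """Build a nested dict from (key_tuple, value) pairs.

    Each key tuple is a path: every element but the last names a nested
    dict, and the last element is the key that holds the value.
    """
    root = {}
    for keys, value in pairs:
        node = root
        for key in keys[:-1]:
            # JSON object keys are strings, so convert explicitly.
            node = node.setdefault(str(key), {})
        node[str(keys[-1])] = value
    return root


def leaves_to_lists(node):
    """Turn each innermost dict into a list of single-key dicts.

    Assumes all branches have the same depth.
    """
    if all(not isinstance(v, dict) for v in node.values()):
        return [{k: v} for k, v in node.items()]
    return {k: leaves_to_lists(v) for k, v in node.items()}


# Hypothetical stand-in for the grouped data: one (index tuple, value)
# pair per row of the grouped table shown in the question.
pairs = [
    ((2010, "mayor", "joe smith"), 100.0),
    ((2010, "mayor", "jay gould"), 12.0),
    ((2010, "govnr", "pati mara"), 500.0),
    ((2010, "govnr", "jess rapp"), 80.0),
]

nested = nest(pairs)
print(json.dumps(nested, indent=4))

# The list-of-dicts shape asked for in the question.
print(json.dumps(leaves_to_lists(nested), indent=4))
```

With the real data, passing grouped["amount"].items() in place of pairs should work, since Series.items() yields (index, value) pairs and each index of a MultiIndex Series is a tuple. Plain dicts are used instead of defaultdict so that the result contains no factory objects and a missing key raises an error instead of silently creating an entry.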
