Pandas MultiIndex (more than 2 levels) DataFrame to Nested Dict/JSON


Question

This question is similar to this one, but I want to take it a step further. Is it possible to extend the solution to work with more levels? The .to_dict() method of a multilevel dataframe has some promising options, but most of them return entries that are keyed by tuples (for example ('A', 0, 0): 274.0) rather than nesting them in dictionaries.

For an example of what I'm looking to accomplish, consider this multiindex dataframe:

import pandas as pd

data = {0: {
        ('A', 0, 0): 274.0, 
        ('A', 0, 1): 19.0, 
        ('A', 1, 0): 67.0, 
        ('A', 1, 1): 12.0, 
        ('B', 0, 0): 83.0, 
        ('B', 0, 1): 45.0
    },
    1: {
        ('A', 0, 0): 254.0, 
        ('A', 0, 1): 11.0, 
        ('A', 1, 0): 58.0, 
        ('A', 1, 1): 11.0, 
        ('B', 0, 0): 76.0, 
        ('B', 0, 1): 56.0
    }   
}
df = pd.DataFrame(data).T
df.index = ['entry1', 'entry2']
df
# output:

         A                              B
         0              1               0
         0      1       0       1       0       1
entry1   274.0  19.0    67.0    12.0    83.0    45.0
entry2   254.0  11.0    58.0    11.0    76.0    56.0

You can imagine that we have many records here, not just two, and that the index names could be longer strings. How could you turn this into nested dictionaries (or directly into JSON) that look like this:

[
 {'entry1': {'A': {0: {0: 274.0, 1: 19.0}, 1: {0: 67.0, 1: 12.0}},
  'B': {0: {0: 83.0, 1: 45.0}}},
 'entry2': {'A': {0: {0: 254.0, 1: 11.0}, 1: {0: 58.0, 1: 11.0}},
  'B': {0: {0: 76.0, 1: 56.0}}}}
]

I think some amount of recursion could be helpful, maybe something like this, but so far I have been unsuccessful.

Recommended answer

So, you really need to do 2 things here:

  • df.to_dict()
  • Turn the result into a nested dictionary.

df.to_dict(orient='index') gives you a dictionary with the index as keys; it looks like this:

>>> df.to_dict(orient='index')
{'entry1': {('A', 0, 0): 274.0,
  ('A', 0, 1): 19.0,
  ('A', 1, 0): 67.0,
  ('A', 1, 1): 12.0,
  ('B', 0, 0): 83.0,
  ('B', 0, 1): 45.0},
 'entry2': {('A', 0, 0): 254.0,
  ('A', 0, 1): 11.0,
  ('A', 1, 0): 58.0,
  ('A', 1, 1): 11.0,
  ('B', 0, 0): 76.0,
  ('B', 0, 1): 56.0}}

Now you need to nest this. Here's a trick from Martijn Pieters to do that:

def nest(d: dict) -> dict:
    """Turn a dict keyed by tuples into nested dicts, one level per tuple element."""
    result = {}
    for key, value in d.items():
        target = result
        for k in key[:-1]:  # traverse all keys but the last
            target = target.setdefault(k, {})
        target[key[-1]] = value
    return result
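
To see what nest does on its own, here is a minimal sketch that runs it on a small hand-written dictionary, without pandas. The sample dictionary flat and its values are made up for illustration; the function body is the one shown above.

```python
def nest(d: dict) -> dict:
    """Turn a dict keyed by tuples into nested dicts, one level per tuple element."""
    result = {}
    for key, value in d.items():
        target = result
        for k in key[:-1]:  # walk (and create) every level except the last
            target = target.setdefault(k, {})
        target[key[-1]] = value  # the last tuple element holds the value
    return result


# Hypothetical sample data: three-element tuple keys, as in one row of the frame.
flat = {
    ('A', 0, 0): 1.0,
    ('A', 0, 1): 2.0,
    ('B', 0, 0): 3.0,
}

nested = nest(flat)
print(nested)
# {'A': {0: {0: 1.0, 1: 2.0}}, 'B': {0: {0: 3.0}}}
```

Because the inner loop simply walks key[:-1], the same function handles tuples of any length, so it is not tied to three levels.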

Putting it all together:

def df_to_nested_dict(df: pd.DataFrame) -> dict:
    d = df.to_dict(orient='index')
    return {k: nest(v) for k, v in d.items()}

Output:

>>> df_to_nested_dict(df)
{'entry1': {'A': {0: {0: 274.0, 1: 19.0}, 1: {0: 67.0, 1: 12.0}},
  'B': {0: {0: 83.0, 1: 45.0}}},
 'entry2': {'A': {0: {0: 254.0, 1: 11.0}, 1: {0: 58.0, 1: 11.0}},
  'B': {0: {0: 76.0, 1: 56.0}}}}
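
The question also asks about JSON. A rough sketch of that last step, assuming the nested dictionary has already been built: JSON object keys are always strings, so the integer level labels come back as '0' and '1' after a round trip. The helper stringify_keys is a hypothetical name introduced here, not part of the original answer; it converts keys to str up front, which may also help if the labels are not plain Python ints.

```python
import json


def stringify_keys(obj):
    """Recursively convert dict keys to str; leave non-dict values untouched."""
    if isinstance(obj, dict):
        return {str(k): stringify_keys(v) for k, v in obj.items()}
    return obj


# One entry of the nested result from above, written out by hand.
nested = {
    'entry1': {'A': {0: {0: 274.0, 1: 19.0}, 1: {0: 67.0, 1: 12.0}},
               'B': {0: {0: 83.0, 1: 45.0}}},
}

text = json.dumps(stringify_keys(nested), indent=2)
round_trip = json.loads(text)

# Integer labels are now strings in the JSON form.
print(round_trip['entry1']['A']['0']['1'])
# 19.0
```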
