Nested dictionary to multiindex dataframe where dictionary keys are column labels

Problem description

Say I have a dictionary that looks like this:

dictionary = {'A' : {'a': [1,2,3,4,5],
                     'b': [6,7,8,9,1]},

              'B' : {'a': [2,3,4,5,6],
                     'b': [7,8,9,1,2]}}

and I want a dataframe that looks something like this:

     A   B
     a b a b
  0  1 6 2 7
  1  2 7 3 8
  2  3 8 4 9
  3  4 9 5 1
  4  5 1 6 2

Is there a convenient way to do this? If I try:

In [99]:

DataFrame(dictionary)

Out[99]:
     A               B
a   [1, 2, 3, 4, 5] [2, 3, 4, 5, 6]
b   [6, 7, 8, 9, 1] [7, 8, 9, 1, 2]

I get a dataframe where each element is a list. What I need is a MultiIndex where each level corresponds to the keys in the nested dict, with one row per element of the lists, as shown above. I think I can work out a very crude solution, but I'm hoping there might be something a bit simpler.

Solution

Pandas wants the MultiIndex values as tuples, not nested dicts. The simplest thing is to convert your dictionary to the right format before trying to pass it to DataFrame:

>>> import pandas
>>> # Python 3: dict.items() replaces Python 2's dict.iteritems()
>>> reform = {(outerKey, innerKey): values for outerKey, innerDict in dictionary.items() for innerKey, values in innerDict.items()}
>>> reform
{('A', 'a'): [1, 2, 3, 4, 5],
 ('A', 'b'): [6, 7, 8, 9, 1],
 ('B', 'a'): [2, 3, 4, 5, 6],
 ('B', 'b'): [7, 8, 9, 1, 2]}
>>> pandas.DataFrame(reform)
   A     B   
   a  b  a  b
0  1  6  2  7
1  2  7  3  8
2  3  8  4  9
3  4  9  5  1
4  5  1  6  2

[5 rows x 4 columns]
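
For reference, here is a self-contained sketch of the same idea as a script, assuming Python 3 and a reasonably recent pandas. The helper name `nested_dict_to_frame` is made up for illustration and is not part of pandas. The second half shows a different route, `pandas.concat` over a dict of per-key frames, which should give the same column layout on this data; that route is an alternative, not the method described above.

```python
import pandas as pd

dictionary = {'A': {'a': [1, 2, 3, 4, 5],
                    'b': [6, 7, 8, 9, 1]},
              'B': {'a': [2, 3, 4, 5, 6],
                    'b': [7, 8, 9, 1, 2]}}


def nested_dict_to_frame(nested):
    """Hypothetical helper: flatten {outer: {inner: values}} into
    {(outer, inner): values} and let DataFrame build MultiIndex columns."""
    reform = {(outer_key, inner_key): values
              for outer_key, inner_dict in nested.items()
              for inner_key, values in inner_dict.items()}
    return pd.DataFrame(reform)


frame = nested_dict_to_frame(dictionary)

# Alternative (not the approach from the answer): build one frame per outer
# key and concatenate along the columns; the dict keys become the outer
# column level.
alt = pd.concat({outer_key: pd.DataFrame(inner_dict)
                 for outer_key, inner_dict in dictionary.items()},
                axis=1)

print(frame)
print(frame['B'])          # selecting an outer key returns the inner columns
print(frame[('A', 'b')])   # a tuple selects a single column
```

The tuple-key version keeps everything in one comprehension; the `concat` version may read more naturally if the inner dicts are already being turned into frames elsewhere.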
