Nested dictionary to multiindex dataframe where dictionary keys are column labels
Say I have a dictionary that looks like this:
dictionary = {'A': {'a': [1, 2, 3, 4, 5],
                    'b': [6, 7, 8, 9, 1]},
              'B': {'a': [2, 3, 4, 5, 6],
                    'b': [7, 8, 9, 1, 2]}}
and I want a dataframe that looks something like this:
   A     B
   a  b  a  b
0  1  6  2  7
1  2  7  3  8
2  3  8  4  9
3  4  9  5  1
4  5  1  6  2
Is there a convenient way to do this? If I try:
In [99]:
DataFrame(dictionary)
Out[99]:
                 A                B
a  [1, 2, 3, 4, 5]  [2, 3, 4, 5, 6]
b  [6, 7, 8, 9, 1]  [7, 8, 9, 1, 2]
I get a dataframe where each element is a list. What I need is a MultiIndex on the columns, where each level corresponds to the keys of the nested dict, and where each row corresponds to one element of the lists, as shown above. I can probably work out a crude solution, but I'm hoping there is something simpler.
Pandas wants the MultiIndex values as tuples, not nested dicts. The simplest thing is to convert your dictionary to the right format (a flat dict keyed by (outer, inner) tuples) before passing it to DataFrame:
>>> # On Python 3 use .items(); .iteritems() exists only on Python 2 dicts
>>> reform = {(outerKey, innerKey): values
...           for outerKey, innerDict in dictionary.items()
...           for innerKey, values in innerDict.items()}
>>> reform
{('A', 'a'): [1, 2, 3, 4, 5],
('A', 'b'): [6, 7, 8, 9, 1],
('B', 'a'): [2, 3, 4, 5, 6],
('B', 'b'): [7, 8, 9, 1, 2]}
>>> pandas.DataFrame(reform)
   A     B
   a  b  a  b
0  1  6  2  7
1  2  7  3  8
2  3  8  4  9
3  4  9  5  1
4  5  1  6  2
[5 rows x 4 columns]
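For reference, here is a self-contained sketch of the same idea on Python 3. The shorter variable names and the second approach using pandas.concat are my own additions rather than part of the answer above; concat takes a dict of frames and uses its keys as the outer column level, which should give the same layout for this data.

```python
import pandas as pd

dictionary = {'A': {'a': [1, 2, 3, 4, 5],
                    'b': [6, 7, 8, 9, 1]},
              'B': {'a': [2, 3, 4, 5, 6],
                    'b': [7, 8, 9, 1, 2]}}

# Flatten to {(outer, inner): list}; tuple keys become MultiIndex column labels.
reform = {(outer, inner): values
          for outer, inner_dict in dictionary.items()
          for inner, values in inner_dict.items()}
df = pd.DataFrame(reform)

# Alternative (an assumption, not from the answer): build one frame per outer
# key and concatenate along the columns; the dict keys form the outer level.
alt = pd.concat({outer: pd.DataFrame(inner_dict)
                 for outer, inner_dict in dictionary.items()},
                axis=1)

# Select a single inner column with a tuple, or a whole group by its outer key.
column_b_of_B = df[('B', 'b')]
group_A = df['A']
```

The dict comprehension is probably the more direct choice when every inner list has the same length; the concat variant may be handier if the inner dicts are already DataFrames or Series.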