Pandas sort_index gives strange result after applying function to grouped DataFrame
Question
Basic setup: I have a DataFrame with a MultiIndex on both the rows and the columns. The second level of the column index has floats for values.
I want to perform a groupby operation (grouping by the first level of the row index). The operation will add a few columns (also with floats as their labels) to each group and then return the group.
When I get the result back from my groupby operation, I can't seem to get the columns to sort properly.
Working example. First, set things up:
import pandas as pd
import numpy as np
np.random.seed(0)
col_level_1 = ['red', 'blue']
col_level_2 = [1., 2., 3., 4.]
row_level_1 = ['a', 'b']
row_level_2 = ['one', 'two']
col_idx = pd.MultiIndex.from_product([col_level_1, col_level_2], names=['color', 'numeral'])
row_idx = pd.MultiIndex.from_product([row_level_1, row_level_2], names=['letter', 'number'])
df = pd.DataFrame(np.random.randn(len(row_idx), len(col_idx)), index=row_idx, columns=col_idx)
This gives the DataFrame df.
Then define my group operation and apply it:
def mygrpfun(group):
    # Add new columns with float labels under each color
    for f in [1.5, 2.5, 3.5]:
        group[('red', f)] = 'hello'
        group[('blue', f)] = 'world'
    return group
result = df.groupby(level='letter').apply(mygrpfun).sort_index(axis=1)
Displaying result shows the second level of the column index out of ascending order.
What's going on here? Why doesn't the 2nd level of the column index display in ascending order?
For context:
In [28]: pd.__version__
Out[28]: '0.14.0'
In [29]: np.__version__
Out[29]: '1.8.1'
Any help is much appreciated.
Recommended answer
The returned result looks as expected. You added columns, and there is no guarantee that any particular order is imposed on those columns.
You can reorder them:
result = result[sorted(result.columns)]
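The reordering can be tried on a smaller frame. This is only a minimal sketch, assuming a reasonably recent pandas; the names frame, ordered and column_order are illustrative and do not come from the question. It adds float-labelled columns to a frame with MultiIndex columns, then selects the columns in the order produced by Python's sorted on the column tuples.

```python
import numpy as np
import pandas as pd

# Small frame with (color, numeral) MultiIndex columns, similar to the question.
cols = pd.MultiIndex.from_product(
    [['red', 'blue'], [1.0, 2.0]], names=['color', 'numeral']
)
frame = pd.DataFrame(np.zeros((2, len(cols))), columns=cols)

# New columns are appended at the end, so the labels are no longer in order.
frame[('red', 1.5)] = 'hello'
frame[('blue', 1.5)] = 'world'

# Sort the column tuples in plain Python, then select the columns in that order.
ordered = frame[sorted(frame.columns)]
column_order = list(ordered.columns)
print(column_order)
```

Because sorted compares the (color, numeral) tuples directly, the order depends only on the label values and not on how the MultiIndex stores its levels internally.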