How do you rename all columns in a multi-level group by in pandas 0.20.1+
Question
With the release of pandas 0.20.1, passing a dictionary to groupby.agg() for renaming is deprecated.
Deprecation documentation
I'm trying to find the best way to update my code to account for this, but I'm struggling because of how I currently use this rename functionality.
When I aggregate, I often apply multiple functions to each source column, and I have been using this rename functionality to get a single-level column index with the new column names.
Example:
import pandas as pd
df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})
In [30]: df
Out[30]:
   A  B  C
0  1  0  0
1  1  1  1
2  1  2  2
3  2  3  3
4  2  4  4
frame = df.groupby('A').agg({'B' : {'foo':'sum'}, 'C': {'bar' : 'min', 'bar2': 'max'}})
This results in:
Out[33]:
    B   C
  foo bar bar2
A
1   3   0    2
2   7   3    4
I would then typically do:
frame = pd.DataFrame(frame).reset_index(col_level=1)
frame.columns = frame.columns.get_level_values(1)
frame
Out[42]:
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4
So I'm looking for good ways to get a result DataFrame that has a single-level column index but new, unique column names, where multiple columns originate from aggregating a single source column. Any recommendations on the best approach are greatly appreciated.
Recommended answer
This works well in version 0.20.1:
d = {'sum':'foo','min':'bar','max':'bar2'}
frame = df.groupby('A').agg({'B' : ['sum'], 'C': ['min', 'max']}).rename(columns=d)
frame.columns = frame.columns.droplevel(0)
frame = frame.reset_index()
print(frame)
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4
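One caveat with renaming by function name alone: the mapping only sees the function names, so it cannot tell two columns apart when the same function is applied to both. The sketch below is a hypothetical variation on the example (it adds 'min' for column B, which is not in the original question) to show the resulting duplicate names.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})
d = {'sum': 'foo', 'min': 'bar', 'max': 'bar2'}

# Hypothetical variation: 'min' is now applied to both B and C.
clash = df.groupby('A').agg({'B': ['sum', 'min'], 'C': ['min', 'max']}).rename(columns=d)

# Dropping the top level discards the source column name (B / C),
# so both 'min' columns end up with the same label.
clash.columns = clash.columns.droplevel(0)

print(list(clash.columns))
print(clash.columns.is_unique)
```

Here two columns are both called bar, so the column index is no longer unique.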
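If min is applied to more than one column, the function name alone is ambiguous, so join both column levels into a single name first and then rename the combined names: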
d = {'B_sum':'foo','C_min':'bar','C_max':'bar2'}
frame = df.groupby('A').agg({'B' : ['sum'], 'C': ['min', 'max']})
frame.columns = frame.columns.map('_'.join)
frame = frame.reset_index().rename(columns=d)
print(frame)
   A  foo  bar  bar2
0  1    3    0     2
1  2    7    3     4
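For readers on newer releases, a possible alternative is named aggregation, which produces the final single-level names directly. This is a minimal sketch that assumes pandas 0.25 or later (it is not available in 0.20.1, so it is not part of the answer above); each keyword is the output column name and each tuple is (source column, function).

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2], 'B': range(5), 'C': range(5)})

# Named aggregation (assumes pandas >= 0.25):
# output_name=(source_column, aggregation_function)
frame = df.groupby('A').agg(
    foo=('B', 'sum'),
    bar=('C', 'min'),
    bar2=('C', 'max'),
).reset_index()

print(frame)
```

Because the output names are given explicitly per source column, applying the same function to several columns does not cause a naming clash.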