Python pandas: mean and sum groupby on different columns at the same time
Question
I have a pandas DataFrame that looks like this:
Name Missed Credit Grade
A 1 3 10
A 1 1 12
B 2 3 10
B 1 2 20
The output I want is:
Name Sum1 Sum2 Average
A 2 4 11
B 3 5 15
Basically, I want the sum of the Credit and Missed columns and the average of Grade. Right now I run two groupby operations on Name, one for the sums and one for the average, and then merge the two resulting dataframes, which does not seem like the best way to do this. I also found the following on SO, which makes sense if I only want to work on one column:
df.groupby('Name')['Credit'].agg(['sum','mean'])
But I am not sure how to write a one-liner that applies different aggregations to different columns.
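For reference, the two-groupby-and-merge approach described above might look roughly like the sketch below. The sample data is taken from the question; the variable names sums, means and merged are illustrative and do not come from the original post.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name": ["A", "A", "B", "B"],
    "Missed": [1, 1, 2, 1],
    "Credit": [3, 1, 3, 2],
    "Grade": [10, 12, 10, 20],
})

# First groupby: sum the two count-like columns
sums = df.groupby("Name")[["Missed", "Credit"]].sum()

# Second groupby: average the grade column
means = df.groupby("Name")[["Grade"]].mean()

# Join the two results on their shared Name index, then turn Name back into a column
merged = sums.merge(means, left_index=True, right_index=True).reset_index()
print(merged)
```

This works, but it groups the data twice and needs an extra merge step, which is what the question is trying to avoid.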
Recommended answer
You need agg with a dictionary that maps each column to its aggregation function, and then rename to set the new column names:
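d = {'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}
# Aggregate each column with its own function, then rename the result columns.
# The result is stored in a new variable so the original df stays available.
out = df.groupby('Name').agg({'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'}).rename(columns=d)
print(out)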
Sum1 Sum2 Average
Name
A 2 4 11
B 3 5 15
If you also want Name as a regular column instead of the index:
# as_index=False keeps Name as a column in the aggregated result
out = (df.groupby('Name', as_index=False)
         .agg({'Missed': 'sum', 'Credit': 'sum', 'Grade': 'mean'})
         .rename(columns={'Missed': 'Sum1', 'Credit': 'Sum2', 'Grade': 'Average'}))
print(out)
Name Sum1 Sum2 Average
0 A 2 4 11
1 B 3 5 15
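As an alternative that is not part of the original answer, newer pandas versions (0.25 and later) support named aggregation, which sets the output column names inside agg and so avoids the separate rename step. A minimal, self-contained sketch using the sample data from the question (the variable name result is illustrative):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name": ["A", "A", "B", "B"],
    "Missed": [1, 1, 2, 1],
    "Credit": [3, 1, 3, 2],
    "Grade": [10, 12, 10, 20],
})

# Named aggregation: output_name=(input_column, aggregation_function)
result = (
    df.groupby("Name")
      .agg(
          Sum1=("Missed", "sum"),
          Sum2=("Credit", "sum"),
          Average=("Grade", "mean"),
      )
      .reset_index()  # turn the Name index back into a regular column
)
print(result)
```

Each keyword names an output column, so the column mapping and the aggregation functions sit in one place.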