Pandas - expanding mean with groupby

Problem description

I'm trying to compute an expanding mean. I can get it to work when I iterate and "group" by filtering on the specific values, but that takes far too long. I feel this should be an easy job for groupby, but when I try it, the expanding mean is computed over the entire dataset rather than within each group of the groupby.

As a simple example:

I want to take this data (in this particular case, grouped by 'player' and 'year') and get an expanding mean.

player  pos year    wk  pa  ra
a       qb  2001    1   10  0       
a       qb  2001    2   5   0
a       qb  2001    3   10  0
a       qb  2002    1   12  0
a       qb  2002    2   13  0
b       rb  2001    1   0   20
b       rb  2001    2   0   17
b       rb  2001    3   0   12
b       rb  2002    1   0   14
b       rb  2002    2   0   15

and get:

player  pos year    wk  pa  ra  avg_pa  avg_ra
a       qb  2001    1   10  0   10      0
a       qb  2001    2   5   0   7.5     0
a       qb  2001    3   10  0   8.3     0
a       qb  2002    1   12  0   12      0
a       qb  2002    2   13  0   12.5    0
b       rb  2001    1   0   20  0       20
b       rb  2001    2   0   17  0       18.5
b       rb  2001    3   0   12  0       16.3
b       rb  2002    1   0   14  0       14
b       rb  2002    2   0   15  0       14.5

I'm not sure where I'm going wrong:

# Group by player and season - also put weeks in correct ascending order
# (the season column is named 'year' in the sample data above)
grouped = calc_averages.groupby(['player','pos','year']).apply(pd.DataFrame.sort_values, 'wk')

grouped['avg_pa'] = grouped['pa'].expanding().mean()

But this gives an expanding mean over the entire set, not one per player and season.
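A likely explanation is that `groupby(...).apply(...)` returns an ordinary DataFrame rather than a grouped object, so the later `.expanding()` call has no groups to respect. The sketch below does not reproduce that exact call; it only contrasts an expanding mean over a plain Series with one over a grouped Series, using the sample data from the question. The frame name `df` and the variable names `whole` and `per_group` are assumptions made for illustration.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "player": ["a"] * 5 + ["b"] * 5,
    "pos": ["qb"] * 5 + ["rb"] * 5,
    "year": [2001, 2001, 2001, 2002, 2002] * 2,
    "wk": [1, 2, 3, 1, 2] * 2,
    "pa": [10, 5, 10, 12, 13, 0, 0, 0, 0, 0],
    "ra": [0, 0, 0, 0, 0, 20, 17, 12, 14, 15],
})

# Plain Series: a single running mean across all ten rows.
whole = df["pa"].expanding().mean()

# Grouped Series: the running mean restarts in every (player, pos, year) group.
per_group = (
    df.groupby(["player", "pos", "year"])["pa"]
      .expanding()
      .mean()
)

print(whole.iloc[3])             # row for player a, 2002, week 1 (mixes in 2001)
print(per_group.to_numpy()[3])   # same row, but only 2002 values are averaged
```

In the plain version the first 2002 row still averages in the 2001 weeks, which matches the behaviour described in the question.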

Recommended answer

Try:

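# Select the columns with a list, [['pa','ra']]; pandas 2.x rejects the bare ['pa','ra'] form on a groupby
df.sort_values('wk').groupby(['player','pos','year'])[['pa','ra']].expanding().mean()\
  .reset_index()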

Output:

  player pos  year  level_3         pa         ra
0      a  qb  2001        0  10.000000   0.000000
1      a  qb  2001        1   7.500000   0.000000
2      a  qb  2001        2   8.333333   0.000000
3      a  qb  2002        3  12.000000   0.000000
4      a  qb  2002        4  12.500000   0.000000
5      b  rb  2001        5   0.000000  20.000000
6      b  rb  2001        6   0.000000  18.500000
7      b  rb  2001        7   0.000000  16.333333
8      b  rb  2002        8   0.000000  14.000000
9      b  rb  2002        9   0.000000  14.500000
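The output above is a new frame, whereas the question asks for `avg_pa` and `avg_ra` columns next to the original data. One possible way to get there, sketched here under the assumption that the frame is called `df` and has a unique index, is to drop the group-key levels from the result so that it lines up with the original row labels again. The intermediate name `means` is hypothetical.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand.
df = pd.DataFrame({
    "player": ["a"] * 5 + ["b"] * 5,
    "pos": ["qb"] * 5 + ["rb"] * 5,
    "year": [2001, 2001, 2001, 2002, 2002] * 2,
    "wk": [1, 2, 3, 1, 2] * 2,
    "pa": [10, 5, 10, 12, 13, 0, 0, 0, 0, 0],
    "ra": [0, 0, 0, 0, 0, 20, 17, 12, 14, 15],
})

# Make sure weeks are in ascending order inside each player/season.
df = df.sort_values(["player", "year", "wk"])

keys = ["player", "pos", "year"]

# The result is indexed by (player, pos, year, original row label);
# dropping the three key levels leaves only the original row label.
means = (
    df.groupby(keys)[["pa", "ra"]]
      .expanding()
      .mean()
      .droplevel(keys)
)

# Series assignment aligns on the index, so each mean lands on its own row.
df["avg_pa"] = means["pa"]
df["avg_ra"] = means["ra"]

print(df)
```

An alternative with the same intent would be `groupby(keys)["pa"].transform(lambda s: s.expanding().mean())`, which returns a Series already aligned to the original rows.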
