Rolling sum by group


Question

Consider this simple example:

import pandas as pd

df = pd.DataFrame({'date' : [pd.to_datetime('2018-01-01'), 
                             pd.to_datetime('2018-01-01'), 
                             pd.to_datetime('2018-01-01'), 
                             pd.to_datetime('2018-01-01')],
                   'group' : ['a','a','b','b'],
                   'value' : [1,2,3,4],
                   'value_useless' : [2,2,2,2]})

df
Out[78]: 
        date group  value  value_useless
0 2018-01-01     a      1              2
1 2018-01-01     a      2              2
2 2018-01-01     b      3              2
3 2018-01-01     b      4              2

Here I want to compute the rolling sum of value by group. I tried the simple approach:

df['rolling_sum'] = df.groupby('group').value.rolling(2).sum()
TypeError: incompatible index of inserted column with frame index

A variant with apply does not work either:

df['rolling_sum'] = df.groupby('group').apply(lambda x: x.value.rolling(2).sum())
TypeError: incompatible index of inserted column with frame index

What am I missing here? Thanks!

Recommended answer

The groupby is adding an index level that is getting in your way. Drop that level so the result lines up with the frame's index again:

rs = df.groupby('group').value.rolling(2).sum()
df.assign(rolling_sum=rs.reset_index(level=0, drop=True))

        date group  value  value_useless  rolling_sum
0 2018-01-01     a      1              2          NaN
1 2018-01-01     a      2              2          3.0
2 2018-01-01     b      3              2          NaN
3 2018-01-01     b      4              2          7.0

Details

rs

# Annoying Index Level
# |
# v
# group   
# a      0    NaN
#        1    3.0
# b      2    NaN
#        3    7.0
# Name: value, dtype: float64
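
To make the extra level concrete, here is a minimal sketch that compares the index of the grouped rolling result with the frame's own index. It uses a trimmed copy of the example frame (only the group and value columns), and the name aligned is just an illustrative choice. Series.droplevel(0) is used here as an equivalent of reset_index(level=0, drop=True).

```python
import pandas as pd

# Trimmed copy of the example frame; the other columns play no role here.
df = pd.DataFrame({'group': ['a', 'a', 'b', 'b'],
                   'value': [1, 2, 3, 4]})

rs = df.groupby('group').value.rolling(2).sum()

# The grouped rolling result has two index levels: the group key on the
# outside and the original row label on the inside. The frame has one.
print(rs.index.nlevels, df.index.nlevels)

# Dropping the outer level leaves only the original row labels,
# so a plain column assignment can align on them.
aligned = rs.droplevel(0)
df['rolling_sum'] = aligned
print(df)
```

Because the assignment aligns on row labels rather than on position, this should also hold when the rows of a group are not adjacent in the frame.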


Alternatively, you can use pd.concat, since each group's rolling result keeps its original row labels:

df.assign(rolling_sum=pd.concat(s.rolling(2).sum() for _, s in df.groupby('group').value))

        date group  value  value_useless  rolling_sum
0 2018-01-01     a      1              2          NaN
1 2018-01-01     a      2              2          3.0
2 2018-01-01     b      3              2          NaN
3 2018-01-01     b      4              2          7.0
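
Another option, not part of the answer above, is groupby(...).transform, which returns a result indexed like the original frame, so no index level has to be dropped. The following is a sketch under that assumption, again on a trimmed copy of the example frame:

```python
import pandas as pd

# Trimmed copy of the example frame.
df = pd.DataFrame({'group': ['a', 'a', 'b', 'b'],
                   'value': [1, 2, 3, 4]})

# transform applies the function to each group's values and stitches the
# pieces back together under the original index, so it can be assigned
# directly as a new column.
df['rolling_sum'] = (
    df.groupby('group')['value']
      .transform(lambda s: s.rolling(2).sum())
)
print(df)
```

The lambda runs once per group in Python, so on frames with very many groups this may be slower than the groupby(...).rolling(...) form shown earlier.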
