Python Pandas: Get a cumulative sum (cumsum) that excludes the current row


Question

I am trying to get a cumulative count of a given column that excludes the current row of the DataFrame.

My code is shown below. The problem with using cumsum() alone is that it includes the current row in the count.

I want df['ExAnte Good Year Count'] to hold the cumulative sum on an ex-ante basis, i.e. excluding the current row from the count.

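import pandas as pd

d = {
      'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
      'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
      'Year Type': ['X', 'Y', 'Z', 'Z', 'Z', 'X', 'Y', 'Z', 'Z']
    }

# Keep 'Year Type' in the frame so it is available for grouping later.
df = pd.DataFrame(d, columns=['Year', 'Good Year', 'Year Type'])
# cumsum() is inclusive: each row's total already counts that row's own value.
df['ExAnte Good Year Count'] = df['Good Year'].cumsum()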

UPDATED QUERY: I would also like to compute the cumulative sum of 'Good Year', grouped by 'Year Type'. I have tried:

df['Good Year'].groupby(['Year Type']).shift().cumsum()

but I get an error: KeyError: 'Year Type'

Recommended answer

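# Group on the DataFrame (not on the 'Good Year' Series), so 'Year Type' can be found.
# Within each group, shift down one row to drop the current row, then accumulate.
# transform keeps the original row index; on recent pandas versions, apply may
# prepend the group key to the index and break the column assignment.
df['Yourcol'] = df.groupby('Year Type', sort=False)['Good Year'].transform(lambda s: s.shift().cumsum())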
df
Out[283]: 
   Good Year  Year Year Type  Yourcol
0          1  2000         X      NaN
1          0  2001         Y      NaN
2          1  2002         Z      NaN
3          0  2003         Z      1.0
4          0  2004         Z      1.0
5          1  2005         X      1.0
6          1  2006         Y      0.0
7          1  2007         Z      1.0
8          0  2008         Z      2.0
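The answer above covers only the grouped case and leaves NaN in the first row of each group. As a minimal sketch, assuming a zero count is acceptable where there is no prior history, the ungrouped and grouped variants could be written as follows. The column name 'ExAnte Count By Type' is made up for illustration, and subtracting the current value from the inclusive cumsum is an alternative to shift(), not the answerer's method.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
    'Year Type': ['X', 'Y', 'Z', 'Z', 'Z', 'X', 'Y', 'Z', 'Z'],
})

# Ungrouped: shift down one row (filling the first row with 0), then accumulate,
# so each row only sees the rows before it.
df['ExAnte Good Year Count'] = df['Good Year'].shift(fill_value=0).cumsum()

# Grouped: take the inclusive cumsum per 'Year Type' and subtract the row's own
# value. 'ExAnte Count By Type' is a hypothetical column name.
df['ExAnte Count By Type'] = (
    df.groupby('Year Type')['Good Year'].cumsum() - df['Good Year']
)

print(df)
```

Both columns stay integer-typed because no NaN is introduced. If a missing value is preferred for the first row of each group, the shift().cumsum() form from the answer is the one to use.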
