Python Pandas: get a cumulative sum (cumsum) that excludes the current row
Question
I am trying to get a cumulative count of a given column that excludes the current row in the DataFrame.
My code is shown below. The problem with using cumsum() alone is that it includes the current row in the count.
I want df['ExAnte Good Year Count'] to hold the cumsum on an ex-ante basis, i.e. excluding the current row from the count.
import pandas as pd

d = {
    'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
    'Year Type': ['X', 'Y', 'Z', 'Z', 'Z', 'X', 'Y', 'Z', 'Z'],
}
df = pd.DataFrame(d, columns=['Year', 'Good Year', 'Year Type'])
# cumsum() includes the current row, which is the behavior I want to avoid
df['ExAnte Good Year Count'] = df['Good Year'].cumsum()
UPDATED QUERY: I would also like to compute the cumulative sum of 'Good Year', grouped by 'Year Type'. I have tried:
df['Good Year'].groupby(['Year Type']).shift().cumsum()
...but I get the error: KeyError: 'Year Type'
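For context, here is a minimal sketch of why that attempt fails, using the sample data from the question. df['Good Year'] is a Series, so it has no 'Year Type' column that the string label could refer to. Passing the key column itself, or grouping the DataFrame before selecting the column, avoids the lookup. The names raised, grouped, and totals are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
    'Year Type': ['X', 'Y', 'Z', 'Z', 'Z', 'X', 'Y', 'Z', 'Z'],
})

# A Series carries only its own values and index, so the label
# 'Year Type' cannot be resolved and pandas raises KeyError.
try:
    df['Good Year'].groupby(['Year Type']).sum()
    raised = False
except KeyError:
    raised = True

# Passing the key column itself (a Series aligned on the same index) works.
grouped = df['Good Year'].groupby(df['Year Type'])
totals = grouped.sum()
```

Grouping the DataFrame first, as in df.groupby('Year Type')['Good Year'], is equivalent and usually reads more naturally.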
Recommended answer
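# Shift each group down one row, then accumulate, so the current row is excluded.
# group_keys=False keeps the original row index so the result aligns with df.
df['Yourcol'] = df.groupby('Year Type', sort=False, group_keys=False)['Good Year'].apply(lambda x: x.shift().cumsum())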
df
Out[283]:
   Good Year  Year Year Type  Yourcol
0          1  2000         X      NaN
1          0  2001         Y      NaN
2          1  2002         Z      NaN
3          0  2003         Z      1.0
4          0  2004         Z      1.0
5          1  2005         X      1.0
6          1  2006         Y      0.0
7          1  2007         Z      1.0
8          0  2008         Z      2.0
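As a possible alternative, not part of the original answer, the exclusive total can also be computed as the inclusive cumsum minus the current row's value. This sketch assumes that 0 is the desired value for the first row of each group rather than NaN, and it keeps the integer dtype. The column name 'ExAnte Count By Type' and the variable via_transform are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008],
    'Good Year': [1, 0, 1, 0, 0, 1, 1, 1, 0],
    'Year Type': ['X', 'Y', 'Z', 'Z', 'Z', 'X', 'Y', 'Z', 'Z'],
})

# Ungrouped: the running total minus the current value leaves
# the total of the earlier rows only.
df['ExAnte Good Year Count'] = df['Good Year'].cumsum() - df['Good Year']

# Grouped: the same idea, with the running total restarted per 'Year Type'.
# GroupBy.cumsum() returns a Series on the original index, so it aligns with df.
df['ExAnte Count By Type'] = (
    df.groupby('Year Type')['Good Year'].cumsum() - df['Good Year']
)

# Shift-then-cumsum via transform, which also returns a result on the
# original index; the first row of each group is NaN, as in the output above.
via_transform = df.groupby('Year Type')['Good Year'].transform(
    lambda s: s.shift().cumsum()
)
```

The subtraction form avoids the float conversion that NaN forces. The transform form is the one to use if a missing value for the first row of each group is meaningful.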