Cumulative Sum using 2 columns
Question
I am trying to create a column that holds a cumulative sum computed using 2 columns. Please see the example of what I am trying to do: @Faith Akici
index  lodgement_year  words      sum  cum_sum
0      2000            the        14   14
1      2000            australia  10   10
2      2000            word       12   12
3      2000            brand      8    8
4      2000            fresh      5    5
5      2001            the        8    22
6      2001            australia  3    13
7      2001            banana     1    1
8      2001            brand      7    15
9      2001            fresh      1    6
I used the code below, but my computer keeps crashing, and I am unsure whether the cause is the code or the computer. Any help will be greatly appreciated:
df_2['cumsum']= df_2.groupby('lodgement_year')['words'].transform(pd.Series.cumsum)
Update: I have also tried the code below. It ran and finished with exit code 0, but with some warnings.
df_2['cum_sum'] =df_2.groupby(['words'])['count'].cumsum()
Recommended answer
Ian, you are almost there! The cumsum() method calculates the cumulative sum of a Pandas column. You are looking for that method applied to the rows grouped by words. Therefore:
In [303]: df_2['cumsum'] = df_2.groupby(['words'])['sum'].cumsum()

In [304]: df_2
Out[304]:
   index  lodgement_year      words  sum  cum_sum  cumsum
0      0            2000        the   14       14      14
1      1            2000  australia   10       10      10
2      2            2000       word   12       12      12
3      3            2000      brand    8        8       8
4      4            2000      fresh    5        5       5
5      5            2001        the    8       22      22
6      6            2001  australia    3       13      13
7      7            2001     banana    1        1       1
8      8            2001      brand    7       15      15
9      9            2001      fresh    1        6       6
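If you want to try this without the original data, here is a minimal self-contained sketch. The rows are typed in by hand from the table in the question, so I am only assuming the real df_2 has this shape. It groups by words and accumulates the numeric sum column, while lodgement_year only determines the order of the rows.

```python
import pandas as pd

# Sample rows copied by hand from the table in the question (an assumption
# about what the real df_2 looks like).
df_2 = pd.DataFrame({
    "lodgement_year": [2000] * 5 + [2001] * 5,
    "words": ["the", "australia", "word", "brand", "fresh",
              "the", "australia", "banana", "brand", "fresh"],
    "sum": [14, 10, 12, 8, 5, 8, 3, 1, 7, 1],
})

# Running total of the numeric column, restarted for each distinct word.
df_2["cum_sum"] = df_2.groupby("words")["sum"].cumsum()

print(df_2)
```

Note that the column being accumulated here is the numeric sum. The first attempt in the question selected the text column words instead, which may be part of why it behaved badly, though I have not verified that on the real data.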
Please comment if this fails on your bigger data set, and we'll work on a possibly more accurate version of this.
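One thing that might matter on a bigger data set: cumsum() accumulates in row order within each group, so the result is a year-by-year running total only when the rows are already ordered by lodgement_year. If that is not guaranteed, a sketch along these lines could help. The rows and the frame name df are made up for illustration; the idea is to sort by year before accumulating and let index alignment write the totals back to the original rows.

```python
import pandas as pd

# Made-up rows that are deliberately NOT in year order.
df = pd.DataFrame({
    "lodgement_year": [2001, 2000, 2001, 2000],
    "words": ["the", "the", "brand", "brand"],
    "sum": [8, 14, 7, 8],
})

# Sort by year (stable, so ties keep their original order), accumulate per
# word, then assign back. The assignment aligns on the index, so df keeps
# its original row order.
df["cum_sum"] = (
    df.sort_values("lodgement_year", kind="stable")
      .groupby("words")["sum"]
      .cumsum()
)

print(df)
```

Whether this is needed depends on how your data is ordered; on the sample in the question it gives the same result as the one-liner.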