Pandas: Cumulative sum of one column based on value of another

Problem description

I am trying to calculate some statistics from a pandas dataframe. It looks something like this:

id     value     conditional
1      10        0
2      20        0
3      30        1
1      15        1
3      5         0
1      10        1

So, I need to calculate the cumulative sum of the value column for each id, from top to bottom, but counting only the rows where conditional is 1.

So, this should give me something like:

id     value     conditional   cumulative sum
1      10        0             0
2      20        0             0
3      30        1             30
1      15        1             15
3      5         0             30
1      10        1             25

So, the sum for id=1 counts only the 4th and 6th rows, where conditional=1; the value in the 1st row is not counted. How do I do this in pandas?

Recommended answer

You can create a Series that is the product of value and conditional, then take its cumulative sum within each id group. Because conditional is either 0 or 1, rows with conditional=0 contribute nothing to the running total:

df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out: 
   id  value  conditional  cumsum
0   1     10            0       0
1   2     20            0       0
2   3     30            1      30
3   1     15            1      15
4   3      5            0      30
5   1     10            1      25
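
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and also shows a variant using Series.where, which may be preferable if conditional is not guaranteed to be strictly 0 or 1. The variable names masked and by_product are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 1],
    "value": [10, 20, 30, 15, 5, 10],
    "conditional": [0, 0, 1, 1, 0, 1],
})

# Approach from the answer: multiply by the 0/1 flag, then accumulate per id.
by_product = (df["value"] * df["conditional"]).groupby(df["id"]).cumsum()

# Variant (an assumption, not from the answer): keep value only where
# conditional == 1, replace everything else with 0, then accumulate per id.
masked = df["value"].where(df["conditional"] == 1, 0)
df["cumsum"] = masked.groupby(df["id"]).cumsum()

print(df)
```

Both approaches give the same result on this data; the where version simply makes the condition explicit instead of relying on the flag being usable as a multiplier.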
