Cumulative count reset on condition

Views: 71 Published: 2021/6/13 20:33:44 Tags: python pandas


Problem description

I have a dataframe similar to this:

import pandas as pd

df = pd.DataFrame({'col1': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c'],
                   'col2': [1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2],
                   'col3': [1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0],
                   'desired': [0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1]})

I want to apply a rolling sum on col3 that resets when either col1 or col2 changes, or when the previous value of col3 was zero.

Note that the count is offset by one row: the value at a given row counts the 1s in the rows before it, not the row itself. This means the desired value for the first row of a new (col1, col2) combination is always zero.

The code below demonstrates the required logic. However, it takes nearly 4 minutes on the larger dataset linked below.

des = []
count = 0
for i in range(1, len(df)):
    # at this point count holds the value for row i-1
    des.append(count)
    prev, curr = df.iloc[i-1], df.iloc[i]
    if (prev.col1 == curr.col1) and \
       (prev.col2 == curr.col2) and \
       (prev.col3 == 1):
        count += 1
    else:
        count = 0

# count now holds the value for the last row
des.append(count)

df['desired'] = des
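
Much of the loop's cost is likely the repeated df.iloc[...] row lookups, each of which builds a new Series. As a minimal sketch (not part of the original question), the same rule can be applied to plain Python lists pulled out of the columns once. The name offset_run_count is a hypothetical helper introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c'],
                   'col2': [1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2],
                   'col3': [1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]})


def offset_run_count(col1, col2, col3):
    """Hypothetical helper: the count for a row depends only on earlier rows."""
    out = []
    count = 0
    prev = None  # (col1, col2, col3) of the previous row, None before the first row
    for key1, key2, flag in zip(col1, col2, col3):
        # Extend the run only if the previous row has the same keys and col3 == 1.
        if prev is not None and prev == (key1, key2, 1):
            count += 1
        else:
            count = 0
        out.append(count)
        prev = (key1, key2, flag)
    return out


# Convert each column to a list once instead of indexing the frame per row.
result = offset_run_count(df['col1'].tolist(),
                          df['col2'].tolist(),
                          df['col3'].tolist())
df['desired'] = result
print(df)
```

This keeps the row-by-row reading of the rule, so it can also serve as a reference to check a vectorized version against.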

A larger dataset to test with: https://www.dropbox.com/s/hbafcq6hdkh4r9r/test.csv?dl=0

Recommended answer

Use groupby with shift first, then count the consecutive 1s:

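# True where the previous row in the same (col1, col2) group has col3 == 1
a = df.groupby(['col1', 'col2'])['col3'].shift().fillna(0).eq(1)
# running total of those flags
b = a.cumsum()

# subtract the running total as of the last unflagged row, so each run restarts at zero
df['desired'] = b - b.where(~a).ffill().fillna(0).astype(int)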

print (df.head(20))
      col1  col2  col3  desired
0   100055     1     1        0
1   100055     1     0        1
2   100055     1     0        0
3   100055     1     0        0
4   100055     1     0        0
5   100055     1     0        0
6   100055     1     0        0
7   100055     1     0        0
8   100055     1     0        0
9   100055     1     0        0
10  100055     1     1        0
11  100055     1     1        1
12  100055     1     0        2
13  100055     1     1        0
14  100055     1     1        1
15  100055     1     0        2
16  100055     1     0        0
17  100055     1     1        0
18  100055     1     0        1
19  100055     1     1        0
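
To see how the answer's expression produces the reset, it may help to look at the intermediate Series on the small sample frame from the question. The sketch below is an illustration added here, not part of the original answer; the names baseline and steps are introduced only for display.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c', 'c'],
                   'col2': [1, 1, 1, 1, 2, 2, 1, 1, 2, 1, 1, 2, 2],
                   'col3': [1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0]})

# Flag rows whose previous row in the same (col1, col2) group has col3 == 1.
# The first row of each group shifts to NaN, which fillna(0) turns into "not flagged".
a = df.groupby(['col1', 'col2'])['col3'].shift().fillna(0).eq(1)

# Running total of flags over the whole frame; it never resets by itself.
b = a.cumsum()

# Running total as of the most recent unflagged row, carried forward over flagged rows.
baseline = b.where(~a).ffill().fillna(0).astype(int)

# Subtracting the baseline leaves the length of the current run of flags.
steps = pd.DataFrame({'col3': df['col3'], 'a': a, 'b': b,
                      'baseline': baseline, 'desired': b - baseline})
print(steps)
```

Note that groupby collects all rows sharing a (col1, col2) pair regardless of their position, so this matches the adjacent-row loop from the question only when each pair occupies one contiguous run of rows, as it does in the sample data.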
