Use cumcount on pandas dataframe with a conditional increment


Problem description

Consider the dataframe

import pandas as pd

df = pd.DataFrame(
    [
        ['A', 1],
        ['A', 1],
        ['B', 1],
        ['B', 0],
        ['A', 0],
        ['A', 1],
        ['B', 1]
    ], columns = ['key', 'cond'])

I want to find a cumulative (running) count (starting at 1) for each key, where we only increment if the previous value in the group had cond == 1. When appended to the above dataframe, this would give

df_result = pd.DataFrame(
    [
        ['A', 1, 1],
        ['A', 1, 2],
        ['B', 1, 1],
        ['B', 0, 2],
        ['A', 0, 3],
        ['A', 1, 3],
        ['B', 1, 2]
    ], columns = ['key', 'cond', 'new'])

Note that essentially the cond values of the last rows in each key group have no effect.

Simply doing a group cumcount

df.groupby('key').cumcount()

of course doesn't account for the cond value of the previous element. How can I take this into account?

Edit

As some of the solutions below don't work on some edge cases, I will give a more comprehensive dataframe for testing.

df = pd.DataFrame(
    [
        ['A', 0],
        ['A', 1],
        ['A', 1],
        ['B', 1],
        ['B', 0],
        ['A', 0],
        ['A', 1],
        ['B', 1],
        ['B', 0]
    ], columns = ['key', 'cond'])

which, when appending the true result, should give

df_result = pd.DataFrame(
    [
        ['A', 0, 1],
        ['A', 1, 1],
        ['A', 1, 2],
        ['B', 1, 1],
        ['B', 0, 2],
        ['A', 0, 3],
        ['A', 1, 3],
        ['B', 1, 2],
        ['B', 0, 3]
    ], columns = ['key', 'cond', 'new'])

Recommended answer

Use groupby

df
  key  cond  new
0   A     1    1
1   A     1    2
2   B     1    1
3   B     0    2
4   A     0    3
5   A     1    3
6   B     1    2
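
One way to produce a column like the one shown above is a grouped shift followed by a grouped cumulative sum. This is a minimal sketch and not necessarily the code the answer originally used. The helper name conditional_cumcount and its parameters are hypothetical, chosen here for illustration. It assumes the rows are already in the order in which the count should run.

```python
import pandas as pd


def conditional_cumcount(df, key="key", cond="cond"):
    """Running count per key, starting at 1, that increments only when
    the previous row in the same group had cond == 1.
    (Hypothetical helper name, not taken from the original answer.)"""
    # cond value of the previous row within each key group
    # (NaN on the first row of each group)
    prev = df.groupby(key)[cond].shift()
    # 1 where the counter should advance: on the first row of a group,
    # or when the previous row in the group had cond == 1
    step = (prev.isna() | prev.eq(1)).astype(int)
    # cumulative sum of those steps within each key group
    return step.groupby(df[key]).cumsum()


# The more comprehensive dataframe from the question's edit
df = pd.DataFrame(
    [
        ['A', 0],
        ['A', 1],
        ['A', 1],
        ['B', 1],
        ['B', 0],
        ['A', 0],
        ['A', 1],
        ['B', 1],
        ['B', 0]
    ], columns=['key', 'cond'])

df['new'] = conditional_cumcount(df)
print(df)
```

On this edge-case dataframe the new column matches the expected result given in the question. The first row of each group always counts as 1, so a leading cond == 0 does not shift the count.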
