Counting within Pandas apply() function

Views: 90 · Published: 2020/5/24 4:06:01 · python pandas

Problem description

I'm trying to iterate through a DataFrame and, whenever a value changes, increment a counter and store the counter's current value in a new column. I can get this to work using a global counter, like so:

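# Module-level state shared across calls. The initial values are assumed
# here; the original post does not show them.
prev_row = None
k = -1

def change_ind(row):
    global prev_row
    global k

    # Bump the counter whenever 'rep' differs from the previous row's value.
    if row['rep'] != prev_row:
        k = k + 1
        prev_row = row['rep']
    return k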

But when I try to pass the counter and the previous value as arguments through apply, as below, it no longer works. It seems as if k and prev_row are reset each time the function is called on a new row. Is there a way to pass arguments to the function and get the result I'm looking for? Or is there a better way to do this altogether?

def change_ind(row, k, prev_row):    
    if row != prev_row:
        k = k+1
        prev_row = row
    return k
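
A likely reason for the apparent reset, assuming the function is called with something like `apply(change_ind, args=(k, prev_row))` (the call itself is not shown in the post): `k` and `prev_row` are ordinary local names inside `change_ind`, so rebinding them there does not change the values that `apply` hands to the next call. If per-element state is really wanted, one option is to keep it on an object. The sketch below is only an illustration; `ChangeCounter` and `_UNSET` are hypothetical names, not part of pandas or of the original post.

```python
import pandas as pd

# Hypothetical sentinel meaning "no value seen yet".
_UNSET = object()


class ChangeCounter:
    """Hypothetical helper: numbers runs of equal consecutive values."""

    def __init__(self):
        self.k = -1          # start at -1 so the first value is numbered 0
        self.prev = _UNSET   # previous value seen, if any

    def __call__(self, value):
        # State lives on the instance, so it survives between calls.
        if self.prev is _UNSET or value != self.prev:
            self.k += 1
            self.prev = value
        return self.k


df = pd.DataFrame({'rep': [0, 1, 1, 1, 2, 3, 2, 3, 4, 5, 1]})
df['rep_f'] = df['rep'].apply(ChangeCounter())
print(df['rep_f'].tolist())
```

This relies on `Series.apply` calling the function once per element, in order. A fresh `ChangeCounter()` is needed for every column it is applied to, since the instance remembers where it left off.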

Recommended answer

You can achieve the same thing with shift and cumsum, which will be significantly faster than looping:

In [107]:
import pandas as pd
df = pd.DataFrame({'rep':[0,1,1,1,2,3,2,3,4,5,1]})
df

Out[107]:
    rep
0     0
1     1
2     1
3     1
4     2
5     3
6     2
7     3
8     4
9     5
10    1

In [108]:
df['rep_f'] = (df['rep'] != df['rep'].shift()).cumsum() - 1
df

Out[108]:
    rep  rep_f
0     0      0
1     1      1
2     1      1
3     1      1
4     2      2
5     3      3
6     2      4
7     3      5
8     4      6
9     5      7
10    1      8
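
To see why the one-liner works, it may help to split it into its intermediate steps. The sketch below uses the same data as above; the names `prev`, `changed` and `run_id` are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'rep': [0, 1, 1, 1, 2, 3, 2, 3, 4, 5, 1]})

# Previous row's value, aligned with the current row; the first row gets NaN.
prev = df['rep'].shift()

# True wherever the value differs from the previous row.
# The first row is always True, because 0 != NaN.
changed = df['rep'] != prev

# Booleans sum as 0/1, so the cumulative sum counts changes seen so far.
# Subtracting 1 makes the numbering start at 0 instead of 1.
run_id = changed.cumsum() - 1

df['rep_f'] = run_id
print(df)
```

One caveat worth keeping in mind: NaN never compares equal to NaN, so if the column itself contains consecutive missing values, each of them would be counted as a change.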
