Reference previous row when iterating through dataframe


Problem description

Is there a simple way to reference the previous row when iterating through a dataframe? In the following dataframe, I would like column B to change to 1 when A > 1 and remain at 1 until A < -1, at which point it changes to -1.

In [11]: df
Out[11]:
                    A    B
2000-01-01  -0.182994    0
2000-01-02   1.290203    0
2000-01-03   0.245229    0
2000-01-08  -1.230742    0
2000-01-09   0.534939    0
2000-01-10   1.324027    0

This is what I've tried, but clearly you can't just subtract 1 from the index:

for idx,row in df.iterrows():
    if df["A"][idx]<-1:
        df["B"][idx] = -1
    elif df["A"][idx]>1:
        df["B"][idx] = 1
    else: 
        df["B"][idx] = df["B"][idx-1] 

I also tried using get_loc but got completely lost. I'm sure I'm missing a very simple solution!

Recommended answer

A similar question is here: Reference values in the previous row with map or apply.
My impression is that pandas should handle the iteration so that we don't have to do it ourselves. Therefore, I chose to use the DataFrame 'apply' method.

Here is the same answer I posted on the other question linked above.

You can use the dataframe 'apply' function and leverage the unused 'kwargs' parameter to store the previous row.

import pandas as pd

df = pd.DataFrame({'a':[0,1,2], 'b':[0,10,20]})

new_col = 'c'

def apply_func_decorator(func):
    prev_row = {}
    def wrapper(curr_row, **kwargs):
        val = func(curr_row, prev_row)
        prev_row.update(curr_row)
        prev_row[new_col] = val
        return val
    return wrapper

@apply_func_decorator
def running_total(curr_row, prev_row):
    return curr_row['a'] + curr_row['b'] + prev_row.get('c', 0)

df[new_col] = df.apply(running_total, axis=1)

print(df)
# Output will be:
#    a   b   c
# 0  0   0   0
# 1  1  10  11
# 2  2  20  33

This example uses a decorator to store the previous row in a dictionary and then pass it to the function when pandas calls it on the next row.
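The same idea can be applied to the dataframe from the question. The following is only a sketch and not part of the original answer: the helper name `remember_previous` and the function `latch` are hypothetical, and the helper stores just the previously returned value rather than the whole previous row. It assumes pandas calls the function once per row, in order.

```python
import pandas as pd

# Data copied from the question; B starts at 0 everywhere.
df = pd.DataFrame(
    {
        "A": [-0.182994, 1.290203, 0.245229, -1.230742, 0.534939, 1.324027],
        "B": 0,
    },
    index=pd.to_datetime(
        ["2000-01-01", "2000-01-02", "2000-01-03",
         "2000-01-08", "2000-01-09", "2000-01-10"]
    ),
)


def remember_previous(func, initial=0):
    """Hypothetical helper: pass func the value it returned for the previous row."""
    state = {"prev": initial}

    def wrapper(curr_row):
        val = func(curr_row, state["prev"])
        state["prev"] = val  # remembered for the next row
        return val

    return wrapper


def latch(curr_row, prev_b):
    """Switch to 1 when A > 1, to -1 when A < -1, otherwise keep the previous value."""
    if curr_row["A"] > 1:
        return 1
    if curr_row["A"] < -1:
        return -1
    return prev_b


df["B"] = df.apply(remember_previous(latch), axis=1)
print(df)
```

Because the wrapper holds state, a fresh `remember_previous(latch)` should be created for every call to `apply`; reusing an old wrapper would carry over the last value from the previous run.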

Disclaimer 1: The 'prev_row' variable starts off empty for the first row, so when using it in the apply function I had to supply a default value to avoid a 'KeyError'.

Disclaimer 2: I am fairly certain this will slow down the apply operation, but I did not do any tests to figure out by how much.
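If speed matters, the specific problem in the question can probably be handled without row-by-row calls at all. The sketch below is an alternative technique, not the answer's method: it marks the rows where A crosses a threshold and forward-fills the gaps. The variable name `marks` is an assumption, and no timing was measured here.

```python
import numpy as np
import pandas as pd

# Data copied from the question.
df = pd.DataFrame(
    {"A": [-0.182994, 1.290203, 0.245229, -1.230742, 0.534939, 1.324027]},
    index=pd.to_datetime(
        ["2000-01-01", "2000-01-02", "2000-01-03",
         "2000-01-08", "2000-01-09", "2000-01-10"]
    ),
)

a = df["A"].to_numpy()

# 1 where A > 1, -1 where A < -1, NaN where the previous value should carry over.
marks = pd.Series(
    np.select([a > 1, a < -1], [1.0, -1.0], default=np.nan),
    index=df.index,
)

# Forward-fill carries the last switch down; rows before the first switch get 0.
df["B"] = marks.ffill().fillna(0).astype(int)
print(df)
```

This works only because each row depends on the previous one through a simple "keep the last value" rule; a calculation like the running total above would need a different vectorized form, such as a cumulative sum.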
