Forward fill combined with a calculation (method='ffill' * xyz) in Python pandas

Problem description

I need to fill NaN values with a calculation that depends on the previous values in the DataFrame df. What I have so far is this:

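import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [None] * 6, 'b': [2, 3, 10, 3, 5, 8]})
df["c"] = np.nan

# Known values; originally df["c"][0] = 1, which is chained assignment
df.loc[0, "c"] = 1
df.loc[2, "c"] = 3

i = 1
while i < 10:
    # Fill remaining NaNs with the value i rows back, times b
    df["c"] = df["c"].fillna(df["c"].shift(i) * df["b"])
    i += 1  # was `i+1`, which never advanced the counter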

Unfortunately, this while loop does not give the right result, and it is certainly a very bad solution for pandas. So what I am looking for is something like

df.c.fillna(method='ffill'*df.b,inplace=True)

I know that doesn't work either; I just think it makes clearer what I am looking for.

Before filling, the DataFrame looks like this:

   b   c
0  2   1
1  3 NaN
2 10   3
3  3 NaN
4  5 NaN
5  8 NaN

The desired outcome should look like this:

      b   c
0     2   1  # nothing filled in since data is set from df["c"][0] = 1
1     3   3  # fill in previous c * b = 1 * 3 = 3
2    10   3  # nothing filled in since data is set from df["c"][2] = 3
3     3   9  # fill in previous c * b = 3 * 3 = 9
4     5  45  # fill in previous c * b = 9 * 5 = 45
5     8 360  # fill in previous c * b = 45 * 8 = 360

So basically: if there is no data available, the cell should be filled with a calculation.
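The rule described above (keep c where it is set, otherwise use the previous c times the current b) can be written as a plain loop. This is only a reference sketch to pin down the expected behaviour, not part of the original question; the function name fill_with_previous_times_b is made up for illustration.

```python
import numpy as np
import pandas as pd


def fill_with_previous_times_b(df):
    """Return column c with each NaN replaced by (previous c) * (current b)."""
    c = df["c"].to_numpy(dtype=float).copy()
    b = df["b"].to_numpy(dtype=float)
    for k in range(1, len(c)):
        if np.isnan(c[k]):
            # c[k - 1] is already filled at this point, so values chain forward
            c[k] = c[k - 1] * b[k]
    return pd.Series(c, index=df.index, name="c")


df = pd.DataFrame({"b": [2, 3, 10, 3, 5, 8],
                   "c": [1, np.nan, 3, np.nan, np.nan, np.nan]})
df["c"] = fill_with_previous_times_b(df)
print(df)
```

A NaN in the very first row has no previous value, so it (and any NaNs directly after it) stays NaN.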

Recommended answer

I can't figure out a way to do this in a single pass. You want some kind of rolling apply that can look at the previous row, but the update to the previous row is not observable until the apply finishes. So, for instance, the following works only because we run the apply 3 times, once for each row in the longest run of consecutive NaNs. This isn't great IMO:

In [103]:
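def func(x):
    # Keep values that are already set
    if pd.notnull(x['c']):
        return x['c']
    # Otherwise use the previous row's c (as it was before this pass) times b
    return df.iloc[x.name - 1]['c'] * x['b']

# Each pass fills only the first NaN of every run, so repeat 3 times
df['c'] = df.apply(func, axis=1)
df['c'] = df.apply(func, axis=1)
df['c'] = df.apply(func, axis=1)
df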

Out[103]:
      a   b    c
0  None   2    1
1  None   3    3
2  None  10    3
3  None   3    9
4  None   5   45
5  None   8  360
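
One way to avoid repeating the apply, offered here as a sketch and not as part of the original answer: every known c starts a new block, and inside a block each filled value is the running product of the known c and the following b values. That maps onto a grouped cumprod. The helper name ffill_times_b and the intermediate names are made up for illustration.

```python
import numpy as np
import pandas as pd


def ffill_times_b(df):
    """Fill NaNs in c with (previous c) * b using a grouped cumulative product."""
    known = df["c"].notna()
    # Block id: increases by one at every row where c is already set
    block = known.cumsum()
    # Use c where it is known and b elsewhere as the multiplication factors
    factor = df["c"].fillna(df["b"])
    filled = factor.groupby(block).cumprod()
    # Rows before the first known c have nothing to build on: leave them NaN
    return filled.where(block > 0)


df = pd.DataFrame({"a": [None] * 6,
                   "b": [2, 3, 10, 3, 5, 8],
                   "c": [1, np.nan, 3, np.nan, np.nan, np.nan]})
df["c"] = ffill_times_b(df)
print(df)
```

This runs in a single pass regardless of how long the runs of NaNs are, whereas the repeated apply needs one pass per row in the longest run.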
