How to perform forward fill in pandas based on 'ent_id' when an ent_id does not exist for a successive date?


Problem description

Suppose I have a DataFrame:

effective_date,ent_id,val
2020-02-03,101,aa
2020-02-03,102,ab
2020-02-03,103,ac
2020-02-03,105,ad

2020-02-04,107,ba
2020-02-04,103,bd
2020-02-04,105,bv
2020-02-04,106,bs
2020-02-04,109,be
2020-02-04,102,bn

2020-02-05,117,ca
2020-02-05,113,cd
2020-02-05,115,cv
2020-02-05,106,cs
2020-02-05,109,ce
2020-02-05,102,cn

The expected output is shown below. If an ent_id does not exist on a successive date, its most recent row should be carried forward. For example, effective_date '2020-02-04' has no row for ent_id 101, so the earlier row is forward filled to that date as 2020-02-04,101,aa, and the same applies to the other dates.

effective_date,ent_id,val
2020-02-03,101,aa
2020-02-03,102,ab
2020-02-03,103,ac
2020-02-03,105,ad

2020-02-04,101,aa
2020-02-04,107,ba
2020-02-04,103,bd
2020-02-04,105,bv
2020-02-04,106,bs
2020-02-04,109,be
2020-02-04,102,bn

2020-02-05,101,aa
2020-02-05,107,ba
2020-02-05,103,bd
2020-02-05,105,bv
2020-02-05,117,ca
2020-02-05,113,cd
2020-02-05,115,cv
2020-02-05,106,cs
2020-02-05,109,ce
2020-02-05,102,cn
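
To make the rule explicit, here is a minimal plain-Python sketch of the logic described above: every date keeps each ent_id seen so far, with its most recent val. The function name `forward_fill_by_ent` and the tuple layout are illustrative assumptions, not part of the original question, and the rows are emitted sorted by ent_id rather than in the order listed above.

```python
from collections import defaultdict

# Sample data from the question as (effective_date, ent_id, val) tuples.
rows = [
    ("2020-02-03", 101, "aa"), ("2020-02-03", 102, "ab"),
    ("2020-02-03", 103, "ac"), ("2020-02-03", 105, "ad"),
    ("2020-02-04", 107, "ba"), ("2020-02-04", 103, "bd"),
    ("2020-02-04", 105, "bv"), ("2020-02-04", 106, "bs"),
    ("2020-02-04", 109, "be"), ("2020-02-04", 102, "bn"),
    ("2020-02-05", 117, "ca"), ("2020-02-05", 113, "cd"),
    ("2020-02-05", 115, "cv"), ("2020-02-05", 106, "cs"),
    ("2020-02-05", 109, "ce"), ("2020-02-05", 102, "cn"),
]


def forward_fill_by_ent(rows):
    """Hypothetical helper: carry each ent_id's latest val forward by date."""
    by_date = defaultdict(list)
    for date, ent_id, val in rows:
        by_date[date].append((ent_id, val))

    latest = {}   # ent_id -> most recent val seen so far
    filled = []
    for date in sorted(by_date):  # ISO dates sort chronologically as strings
        for ent_id, val in by_date[date]:
            latest[ent_id] = val  # a row present on this date overrides older ones
        for ent_id in sorted(latest):
            filled.append((date, ent_id, latest[ent_id]))
    return filled


filled = forward_fill_by_ent(rows)
for row in filled:
    print(",".join(str(field) for field in row))
```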

My attempt

df['effective_date'] = pd.to_datetime(df['effective_date'])

df1 = (df.set_index(['effective_date',df.groupby('effective_date').cumcount()])
         .unstack()
         .ffill()
         .stack()
         .reset_index(level=1, drop=True)
         .reset_index())

But this does not produce the expected output.

Recommended answer

You can just pivot, ffill, then stack:

(df.pivot(index='effective_date', columns='ent_id')
   .ffill().stack().reset_index()
)

Output:

   effective_date  ent_id val
0      2020-02-03     101  aa
1      2020-02-03     102  ab
2      2020-02-03     103  ac
3      2020-02-03     105  ad
4      2020-02-04     101  aa
5      2020-02-04     102  bn
6      2020-02-04     103  bd
7      2020-02-04     105  bv
8      2020-02-04     106  bs
9      2020-02-04     107  ba
10     2020-02-04     109  be
11     2020-02-05     101  aa
12     2020-02-05     102  cn
13     2020-02-05     103  bd
14     2020-02-05     105  bv
15     2020-02-05     106  cs
16     2020-02-05     107  ba
17     2020-02-05     109  ce
18     2020-02-05     113  cd
19     2020-02-05     115  cv
20     2020-02-05     117  ca
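
For readers who want to reproduce this end to end, the following is a self-contained sketch of the same pivot, ffill, stack approach. It makes a few assumptions beyond the answer: the sample data is loaded from an inline CSV string, `values="val"` is passed to `pivot` so the columns stay single-level, and an explicit `dropna()` and `sort_values()` are added because, as far as I know, whether `stack()` drops missing combinations and how it orders rows depends on the pandas version. Pivoting aligns values by ent_id, which is why it works where the cumcount-based attempt, which aligns rows by position within each date, does not.

```python
import io

import pandas as pd

csv_text = """effective_date,ent_id,val
2020-02-03,101,aa
2020-02-03,102,ab
2020-02-03,103,ac
2020-02-03,105,ad
2020-02-04,107,ba
2020-02-04,103,bd
2020-02-04,105,bv
2020-02-04,106,bs
2020-02-04,109,be
2020-02-04,102,bn
2020-02-05,117,ca
2020-02-05,113,cd
2020-02-05,115,cv
2020-02-05,106,cs
2020-02-05,109,ce
2020-02-05,102,cn
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["effective_date"])

# One row per date, one column per ent_id; absent combinations become NaN.
wide = df.pivot(index="effective_date", columns="ent_id", values="val")

# Carry each ent_id's last known value down to later dates.
wide_filled = wide.ffill()

# Back to long format. dropna() removes ent_ids that have not appeared yet,
# regardless of whether this pandas version's stack() already dropped them.
result = (
    wide_filled.stack()
    .dropna()
    .rename("val")
    .reset_index()
    .sort_values(["effective_date", "ent_id"])
    .reset_index(drop=True)
)

print(result)
```

Note that `pivot` raises an error if the same (effective_date, ent_id) pair appears more than once, so this sketch assumes each pair is unique, as it is in the sample data.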
