How to forward fill in pandas based on 'ent_id' when an ent_id does not exist for a successive date?
Problem description
Suppose I have a DataFrame:
effective_date,ent_id,val
2020-02-03,101,aa
2020-02-03,102,ab
2020-02-03,103,ac
2020-02-03,105,ad
2020-02-04,107,ba
2020-02-04,103,bd
2020-02-04,105,bv
2020-02-04,106,bs
2020-02-04,109,be
2020-02-04,102,bn
2020-02-05,117,ca
2020-02-05,113,cd
2020-02-05,115,cv
2020-02-05,106,cs
2020-02-05,109,ce
2020-02-05,102,cn
The expected output is shown below. If an ent_id that appeared on an earlier date does not exist on a successive date, its most recent row should be forward filled to that date. For example, on effective_date '2020-02-04' there is no ent_id 101, so it is carried forward to that date as 2020-02-04,101,aa, and similarly for the other dates.
effective_date,ent_id,val
2020-02-03,101,aa
2020-02-03,102,ab
2020-02-03,103,ac
2020-02-03,105,ad
2020-02-04,101,aa
2020-02-04,107,ba
2020-02-04,103,bd
2020-02-04,105,bv
2020-02-04,106,bs
2020-02-04,109,be
2020-02-04,102,bn
2020-02-05,101,aa
2020-02-05,107,ba
2020-02-05,103,bd
2020-02-05,105,bv
2020-02-05,117,ca
2020-02-05,113,cd
2020-02-05,115,cv
2020-02-05,106,cs
2020-02-05,109,ce
2020-02-05,102,cn
My attempt
df['effective_date'] = pd.to_datetime(df['effective_date'])
df1 = (df.set_index(['effective_date', df.groupby('effective_date').cumcount()])
         .unstack()
         .ffill()
         .stack()
         .reset_index(level=1, drop=True)
         .reset_index())
But this does not produce the expected output.
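A likely reason the attempt misaligns: cumcount() numbers the rows within each date (0, 1, 2, ...), so after unstack() the columns represent row positions rather than ent_id values, and ffill() carries values forward per position. The sketch below reproduces this on a reduced, made-up three-row frame (the name toy and its values are illustrative only, not the data above).

```python
import pandas as pd

# Hypothetical reduced frame: ent_id 101 is missing on the second date.
toy = pd.DataFrame({
    "effective_date": ["2020-02-03", "2020-02-03", "2020-02-04"],
    "ent_id": [101, 102, 102],
    "val": ["a", "b", "c"],
})
toy["effective_date"] = pd.to_datetime(toy["effective_date"])

# Row position within each date: 0, 1 for the first date and 0 for the second.
pos = toy.groupby("effective_date").cumcount()

# Same pipeline as the attempt: columns are keyed by position, not by ent_id.
wide = toy.set_index(["effective_date", pos]).unstack()
result = (wide.ffill()
              .stack()
              .reset_index(level=1, drop=True)
              .reset_index())

# On the second date, position 1 is filled from ent_id 102 of the first date,
# so ent_id 102 appears twice and ent_id 101 is never carried forward.
day2 = result[result["effective_date"] == pd.Timestamp("2020-02-04")]
print(day2)
```

Keying the wide layout by ent_id instead of by position avoids this mismatch.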
Recommended answer
You can just pivot, ffill, then stack:
(df.pivot(index='effective_date', columns='ent_id')
.ffill().stack().reset_index()
)
Output:
effective_date ent_id val
0 2020-02-03 101 aa
1 2020-02-03 102 ab
2 2020-02-03 103 ac
3 2020-02-03 105 ad
4 2020-02-04 101 aa
5 2020-02-04 102 bn
6 2020-02-04 103 bd
7 2020-02-04 105 bv
8 2020-02-04 106 bs
9 2020-02-04 107 ba
10 2020-02-04 109 be
11 2020-02-05 101 aa
12 2020-02-05 102 cn
13 2020-02-05 103 bd
14 2020-02-05 105 bv
15 2020-02-05 106 cs
16 2020-02-05 107 ba
17 2020-02-05 109 ce
18 2020-02-05 113 cd
19 2020-02-05 115 cv
20 2020-02-05 117 ca
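For anyone who wants to run this end to end, here is a self-contained sketch of the same pivot, ffill, stack idea using the sample data from the question. The names csv_text, wide, filled and out are illustrative, and the explicit values='val' and dropna() are additions made on the assumption that the code should behave the same whether or not the installed pandas version drops missing values during stack().

```python
import io
import pandas as pd

# Sample data copied from the question.
csv_text = """effective_date,ent_id,val
2020-02-03,101,aa
2020-02-03,102,ab
2020-02-03,103,ac
2020-02-03,105,ad
2020-02-04,107,ba
2020-02-04,103,bd
2020-02-04,105,bv
2020-02-04,106,bs
2020-02-04,109,be
2020-02-04,102,bn
2020-02-05,117,ca
2020-02-05,113,cd
2020-02-05,115,cv
2020-02-05,106,cs
2020-02-05,109,ce
2020-02-05,102,cn
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["effective_date"])

# One row per date, one column per ent_id; missing pairs become NaN.
wide = df.pivot(index="effective_date", columns="ent_id", values="val")

# Carry each ent_id's last known value down to later dates.
filled = wide.ffill()

# Back to long form. dropna() removes ent_ids that have not appeared yet
# (for example 106 on 2020-02-03), in case stack() kept them as NaN.
out = (filled.stack()
             .dropna()
             .rename("val")
             .reset_index())

print(out)
```

Note that pivot raises a ValueError if the same (effective_date, ent_id) pair occurs more than once, so duplicate pairs would need to be resolved first.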