Fill missing dates by group in pandas
Question
I need to fill the missing dates down by group. Here is the code to create the data frame. I want to carry each date in the 'fill' column down only until the next date appears in that column, and never past the point where the group 'name' changes.
import numpy as np
import pandas as pd

# np.nan marks the rows whose fill date is missing
data = {'tdate': [20080815,20080915,20081226,20090110,20090131,20080807,20080831,
                  20080918,20081023,20081114,20081207,20090117,20090203,20090219,20090305,20090318,20090501],
        'name': ['A','A','A','A','A','B','B','B','B','B','B','B','B','B','B','B','B'],
        'fill': [np.nan,np.nan,20080915,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,20081023,
                 np.nan,np.nan,np.nan,np.nan,20090219,np.nan,np.nan]}
df = pd.DataFrame(data, columns=['tdate', 'name', 'fill'])
df
Current data frame
tdate name fill
0 20080815 A NaN
1 20080915 A NaN
2 20081226 A 20080915
3 20090110 A NaN
4 20090131 A NaN
5 20080807 B NaN
6 20080831 B NaN
7 20080918 B NaN
8 20081023 B NaN
9 20081114 B 20081023
10 20081207 B NaN
11 20090117 B NaN
12 20090203 B NaN
13 20090219 B NaN
14 20090305 B 20090219
15 20090318 B NaN
16 20090501 B NaN
Desired output
tdate name fill
0 20080815 A NaN
1 20080915 A NaN
2 20081226 A 20080915
3 20090110 A 20080915
4 20090131 A 20080915
5 20080807 B NaN
6 20080831 B NaN
7 20080918 B NaN
8 20081023 B NaN
9 20081114 B NaN
10 20081207 B 20081023
11 20090117 B 20081023
12 20090203 B 20081023
13 20090219 B 20081023
14 20090305 B 20081023
15 20090318 B 20090219
16 20090501 B 20090219
Here is my code
df.groupby(df["name"])["fill"].fill()
Recommended answer
You were pretty close; you just need to forward-fill (ffill) rather than fill:
df.groupby('name')["fill"].ffill()
Out[42]:
0 NaN
1 NaN
2 20080915
3 20080915
4 20080915
5 NaN
6 NaN
7 NaN
8 NaN
9 20081023
10 20081023
11 20081023
12 20081023
13 20081023
14 20090219
15 20090219
16 20090219
dtype: float64
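Or equivalently (note that fillna(method='ffill') is deprecated since pandas 2.1, so ffill() is the safer choice on current versions):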
df.groupby('name')["fill"].fillna(method='ffill')
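The calls above return a new Series and leave df unchanged, so the result still has to be written back. Below is a minimal sketch of one way to do that, assuming the same data as in the question. The column names 'filled' and 'filled_int' are hypothetical, chosen only for illustration, and the cast to the nullable 'Int64' dtype is an optional extra that keeps the dates as integers instead of floats.

```python
import numpy as np
import pandas as pd

data = {
    'tdate': [20080815, 20080915, 20081226, 20090110, 20090131,
              20080807, 20080831, 20080918, 20081023, 20081114, 20081207,
              20090117, 20090203, 20090219, 20090305, 20090318, 20090501],
    'name': ['A'] * 5 + ['B'] * 12,
    'fill': [np.nan, np.nan, 20080915, np.nan, np.nan,
             np.nan, np.nan, np.nan, np.nan, 20081023,
             np.nan, np.nan, np.nan, np.nan, 20090219, np.nan, np.nan],
}
df = pd.DataFrame(data, columns=['tdate', 'name', 'fill'])

# Forward-fill within each 'name' group; the result keeps df's index,
# so it can be assigned straight back as a column.
# 'filled' is a hypothetical column name used for illustration.
df['filled'] = df.groupby('name')['fill'].ffill()

# Optional: NaN forces the column to float64. Casting to the nullable
# 'Int64' dtype stores the dates as integers, with <NA> for the gaps.
# 'filled_int' is likewise a hypothetical column name.
df['filled_int'] = df['filled'].astype('Int64')

print(df)
```

Because the fill is done per group, the last value of group A (20080915) is not carried into the first rows of group B, which stay missing until B's own first date appears.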