Pandas - insert rows where data is missing
Question
I have a dataset; here is an example:
import pandas as pd

df = pd.DataFrame({"Seconds_left": [5, 10, 15, 25, 30, 35, 5, 10, 15, 30],
                   "Team": ["ATL", "ATL", "ATL", "ATL", "ATL", "ATL", "SAS", "SAS", "SAS", "SAS"],
                   "Fouls": [1, 2, 3, 3, 4, 5, 5, 4, 1, 1]})
   Fouls  Seconds_left Team
0      1             5  ATL
1      2            10  ATL
2      3            15  ATL
3      3            25  ATL
4      4            30  ATL
5      5            35  ATL
6      5             5  SAS
7      4            10  SAS
8      1            15  SAS
9      1            30  SAS
Now I would like to insert a row wherever a value in the Seconds_left column is missing for a team:
Id  Fouls  Seconds_left Team
 0      1             5  ATL
 1      2            10  ATL
 2      3            15  ATL
 3    NaN            20  ATL
 4      3            25  ATL
 5      4            30  ATL
 6      5            35  ATL
 7      5             5  SAS
 8      4            10  SAS
 9      1            15  SAS
10    NaN            20  SAS
11    NaN            25  SAS
12      1            30  SAS
13    NaN            35  SAS
I have already tried reindexing and similar approaches, but they do not work because Seconds_left contains duplicate values.
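A minimal reproduction of that failure, assuming the attempt was to use Seconds_left alone as the index (the exact attempt is not shown in the question, so this is only a guess at it):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Seconds_left": [5, 10, 15, 25, 30, 35, 5, 10, 15, 30],
    "Team": ["ATL"] * 6 + ["SAS"] * 4,
    "Fouls": [1, 2, 3, 3, 4, 5, 5, 4, 1, 1],
})

# Seconds_left alone is not unique (5, 10, 15 and 30 occur for both teams),
# so pandas cannot tell which existing row each target label refers to.
by_seconds = df.set_index("Seconds_left")

try:
    by_seconds.reindex(np.arange(5, 40, 5))
    reindex_failed = False
except ValueError as exc:
    reindex_failed = True
    print(f"reindex failed: {exc}")
```

The exact error message differs between pandas versions, but in each case it points at the duplicate labels on the index.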
Does anybody have an idea how to solve this?
Thanks!
Recommended answer
Create a MultiIndex containing every (Team, Seconds_left) combination, reindex against it, then call reset_index:
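import numpy as np
import pandas as pd
# Every (Team, Seconds_left) pair on a 5-second grid up to the largest observed value
idx = pd.MultiIndex.from_product([df['Team'].unique(),
                                  np.arange(5, df['Seconds_left'].max() + 1, 5)],
                                 names=['Team', 'Seconds_left'])
# Pairs that are absent from df come back as rows with NaN in Fouls
df.set_index(['Team', 'Seconds_left']).reindex(idx).reset_index()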
Out:
   Team  Seconds_left  Fouls
0   ATL             5    1.0
1   ATL            10    2.0
2   ATL            15    3.0
3   ATL            20    NaN
4   ATL            25    3.0
5   ATL            30    4.0
6   ATL            35    5.0
7   SAS             5    5.0
8   SAS            10    4.0
9   SAS            15    1.0
10  SAS            20    NaN
11  SAS            25    NaN
12  SAS            30    1.0
13  SAS            35    NaN
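For convenience, here is the same approach as a self-contained sketch that can be run from top to bottom. The names `step`, `seconds`, and `filled` are introduced here for illustration only, and the 5-second spacing is an assumption read off the sample data. The last step is optional: it casts Fouls to pandas' nullable `Int64` dtype, because reindexing otherwise turns the column into floats so that it can hold NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Seconds_left": [5, 10, 15, 25, 30, 35, 5, 10, 15, 30],
    "Team": ["ATL"] * 6 + ["SAS"] * 4,
    "Fouls": [1, 2, 3, 3, 4, 5, 5, 4, 1, 1],
})

step = 5  # assumed spacing between consecutive Seconds_left values
seconds = np.arange(step, df["Seconds_left"].max() + 1, step)

# Full grid: every team paired with every expected Seconds_left value.
idx = pd.MultiIndex.from_product(
    [df["Team"].unique(), seconds],
    names=["Team", "Seconds_left"],
)

# (Team, Seconds_left) is unique in df, so reindexing is well defined here;
# grid entries with no matching row get NaN in Fouls.
filled = df.set_index(["Team", "Seconds_left"]).reindex(idx).reset_index()

# Optional: keep Fouls as integers, with <NA> marking the inserted rows.
filled["Fouls"] = filled["Fouls"].astype("Int64")

print(filled)
```

This relies on each (Team, Seconds_left) pair occurring at most once. If a pair can repeat in the real data, the reindex step raises the same duplicate-label error, and the rows would need to be aggregated or deduplicated first.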