Missing data: insert rows in Pandas and fill with NaN


Question

I'm new to Python and Pandas, so there might be a simple solution that I don't see.

I have a number of discontinuous datasets that look like this:

ind A    B  C  
0   0.0  1  3  
1   0.5  4  2  
2   1.0  6  1  
3   3.5  2  0  
4   4.0  4  5  
5   4.5  3  3  

I am looking for a way to get the following:

ind A    B    C
0   0.0  1    3
1   0.5  4    2
2   1.0  6    1
3   1.5  NaN  NaN
4   2.0  NaN  NaN
5   2.5  NaN  NaN
6   3.0  NaN  NaN
7   3.5  2    0
8   4.0  4    5
9   4.5  3    3

The problem is that the gap in A varies in position and length from dataset to dataset.

Recommended answer

set_index and reset_index are your friends.

from numpy import arange
from pandas import DataFrame, Index
df = DataFrame({"A":[0,0.5,1.0,3.5,4.0,4.5], "B":[1,4,6,2,4,3], "C":[3,2,1,0,5,3]})

First, move column A to the index:

In [64]: df.set_index("A")
Out[64]: 
     B  C
 A        
0.0  1  3
0.5  4  2
1.0  6  1
3.5  2  0
4.0  4  5
4.5  3  3

Then reindex with a new index that covers every value A should take; rows that have no data are filled with NaN. We use an Index object because it can be given a name, which is used in the next step.

In [66]: new_index = Index(arange(0,5,0.5), name="A")
In [67]: df.set_index("A").reindex(new_index)
Out[67]: 
      B   C
0.0   1   3
0.5   4   2
1.0   6   1
1.5 NaN NaN
2.0 NaN NaN
2.5 NaN NaN
3.0 NaN NaN
3.5   2   0
4.0   4   5
4.5   3   3
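
The range 0 to 5 is hard-coded in the session above, while the question says the gap moves from dataset to dataset. A minimal sketch of deriving the new index from the data itself follows. It assumes the spacing of A (`step`, 0.5 here) is known in advance, and the names `step`, `n_points` and `reindexed` are illustrative rather than part of the original answer. Matching on float labels is exact here because multiples of 0.5 are representable in binary; for a step such as 0.1 some rounding may be needed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [0, 0.5, 1.0, 3.5, 4.0, 4.5],
                   "B": [1, 4, 6, 2, 4, 3],
                   "C": [3, 2, 1, 0, 5, 3]})

step = 0.5  # assumption: the regular spacing of column A is known

# Number of grid points from the smallest to the largest A, both inclusive.
n_points = int(round((df["A"].max() - df["A"].min()) / step)) + 1

# Build the full grid and name it "A" so reset_index can restore the column later.
new_index = pd.Index(df["A"].min() + step * np.arange(n_points), name="A")

# Rows whose A value is absent from df come back filled with NaN.
reindexed = df.set_index("A").reindex(new_index)
print(reindexed)
```

Computing the number of points first and multiplying avoids relying on the exclusive upper bound of `arange` with a float step.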

Finally, move the index back to the columns with reset_index. Because we named the index, the restored column is automatically called A:

In [69]: df.set_index("A").reindex(new_index).reset_index()
Out[69]: 
       A   B   C
0    0.0   1   3
1    0.5   4   2
2    1.0   6   1
3    1.5 NaN NaN
4    2.0 NaN NaN
5    2.5 NaN NaN
6    3.0 NaN NaN
7    3.5   2   0
8    4.0   4   5
9    4.5   3   3
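
The three steps can be wrapped into one helper so the same call works on every dataset, whatever the position and length of its gap. This is only a sketch: `insert_missing_rows` is a hypothetical name, and inferring the step as the smallest spacing between consecutive values is an assumption that holds only when each dataset contains at least one pair of neighbouring grid points.

```python
import numpy as np
import pandas as pd


def insert_missing_rows(df, column="A", step=None):
    """Return df with one row per grid value of `column`; new rows hold NaN."""
    values = df[column]
    if step is None:
        # Assumption: the smallest gap between sorted values is the grid step.
        step = values.sort_values().diff().min()
    n_points = int(round((values.max() - values.min()) / step)) + 1
    grid = pd.Index(values.min() + step * np.arange(n_points), name=column)
    # set_index -> reindex -> reset_index, as in the answer above.
    return df.set_index(column).reindex(grid).reset_index()


df = pd.DataFrame({"A": [0, 0.5, 1.0, 3.5, 4.0, 4.5],
                   "B": [1, 4, 6, 2, 4, 3],
                   "C": [3, 2, 1, 0, 5, 3]})

filled = insert_missing_rows(df)
print(filled)
```

Columns B and C become floating point in the result, because NaN cannot be stored in an integer column.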
