Remove NaN values from pandas dataframe and reshape table
Question
Given a dataframe whose columns are interspersed with NaNs, how can the dataframe be transformed to remove all the NaNs from the columns?
import pandas as pd
import numpy as np
# dataframe from list of lists
list_of_lists = [[ 4., 7., 1., np.nan],
[np.nan, np.nan, 3., 3.],
[ 4., 9., np.nan, np.nan],
[np.nan, np.nan, 7., 9.],
[np.nan, 2., np.nan, 2.],
[4., np.nan, np.nan, np.nan]]
df_from_lists = pd.DataFrame(list_of_lists, columns=['A', 'B', 'C', 'D'])
# dataframe from list of dicts
list_of_dicts = [{'A': 4.0, 'B': 7.0, 'C': 1.0},
{'C': 3.0, 'D': 3.0},
{'A': 4.0, 'B': 9.0},
{'C': 7.0, 'D': 9.0},
{'B': 2.0, 'D': 2.0},
{'A': 4.0}]
df_from_dicts = pd.DataFrame(list_of_dicts)
Display of the DataFrame
A B C D
0 4.0 7.0 1.0 NaN
1 NaN NaN 3.0 3.0
2 4.0 9.0 NaN NaN
3 NaN NaN 7.0 9.0
4 NaN 2.0 NaN 2.0
5 4.0 NaN NaN NaN
Expected output
A B C D
0 4.0 7.0 1.0 3.0
1 4.0 9.0 3.0 9.0
2 4.0 2.0 7.0 2.0
Recommended answer
You need apply with dropna. Within each column, dropna removes the NaNs but keeps the original index labels, so take the underlying NumPy array with .values and wrap it in a new Series to reset the index:
df.apply(lambda x: pd.Series(x.dropna().values))
Sample:
df = pd.DataFrame({'B':[4,np.nan,4,np.nan,np.nan,4],
'C':[7,np.nan,9,np.nan,2,np.nan],
'D':[1,3,np.nan,7,np.nan,np.nan],
'E':[np.nan,3,np.nan,9,2,np.nan]})
print(df)
B C D E
0 4.0 7.0 1.0 NaN
1 NaN NaN 3.0 3.0
2 4.0 9.0 NaN NaN
3 NaN NaN 7.0 9.0
4 NaN 2.0 NaN 2.0
5 4.0 NaN NaN NaN
df1 = df.apply(lambda x: pd.Series(x.dropna().values))
print(df1)
B C D E
0 4.0 7.0 1.0 3.0
1 4.0 9.0 3.0 9.0
2 4.0 2.0 7.0 2.0
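The sample above works out neatly because every column holds exactly three non-NaN values. As a minimal sketch of what to expect otherwise, the snippet below applies the same apply/dropna pattern to a frame whose columns have different numbers of valid entries. The helper name compact_columns and the uneven data are hypothetical, introduced only for illustration; .to_numpy() is used here in place of .values and should behave the same for this purpose.

```python
import numpy as np
import pandas as pd


def compact_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop NaNs from each column and re-index the remaining values from 0.

    Hypothetical helper: it only wraps the apply/dropna one-liner from the answer.
    """
    return frame.apply(lambda col: pd.Series(col.dropna().to_numpy()))


# Illustrative data (not from the question): 'A' has three valid values, 'B' has two.
uneven = pd.DataFrame({'A': [1.0, np.nan, 2.0, 3.0],
                       'B': [np.nan, 5.0, np.nan, 6.0]})

compacted = compact_columns(uneven)
print(compacted)
```

When the counts differ, the shorter columns end up padded with NaN at the bottom, so the result has as many rows as the longest compacted column. If a strictly rectangular table without NaN is required, the columns need the same number of valid values to begin with.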