Pandas: IndexingError: Unalignable boolean Series provided as indexer
Question
I'm trying to run what I think is simple code to drop any column that contains only NaNs, but I can't get it to work (axis = 1 works just fine when eliminating rows):
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                   'b': [4, np.nan, 6, np.nan],
                   'c': [np.nan, 8, 9, np.nan],
                   'd': [np.nan, np.nan, np.nan, np.nan]})
df = df[df.notnull().any(axis=0)]  # this line raises the IndexingError
print(df)
Full error:
raise IndexingError('Unalignable boolean Series provided as '
pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match
Expected output:
     a    b    c
0  1.0  4.0  NaN
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  NaN  NaN  NaN
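Answer
You need loc, because you are filtering by columns. The mask has one entry per column, so it is indexed by the column labels, whereas df[mask] filters rows and tries to align the mask with the row index: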
print(df.notnull().any(axis=0))
a     True
b     True
c     True
d    False
dtype: bool

df = df.loc[:, df.notnull().any(axis=0)]
print(df)
     a    b    c
0  1.0  4.0  NaN
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  NaN  NaN  NaN
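To make the alignment issue concrete, here is a minimal sketch that rebuilds the DataFrame from the question and compares a column mask with a row mask. The names col_mask, row_mask, rows_kept and cols_kept are illustrative and not part of the original post, and the broad except clause is an assumption made so the snippet does not depend on where a given pandas version exposes IndexingError.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                   'b': [4, np.nan, 6, np.nan],
                   'c': [np.nan, 8, 9, np.nan],
                   'd': [np.nan, np.nan, np.nan, np.nan]})

# axis=0 reduces down each column: one boolean per column, indexed by 'a'..'d'
col_mask = df.notnull().any(axis=0)
# axis=1 reduces across each row: one boolean per row, indexed by 0..3
row_mask = df.notnull().any(axis=1)

print(col_mask.index.equals(df.columns))  # column mask lines up with the columns
print(row_mask.index.equals(df.index))    # row mask lines up with the row index

# df[mask] filters rows, so only the row mask can be aligned
rows_kept = df[row_mask]
# the column mask has to go in the column position of loc
cols_kept = df.loc[:, col_mask]

try:
    df[col_mask]  # labels 'a'..'d' cannot be matched against row labels 0..3
except Exception as exc:  # broad on purpose, see the note above
    print(type(exc).__name__)
```

This is why the axis = 1 version from the question works with plain []: its mask shares the DataFrame's row index.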
Or filter the column labels first and then select with []:
print(df.columns[df.notnull().any(axis=0)])
Index(['a', 'b', 'c'], dtype='object')

df = df[df.columns[df.notnull().any(axis=0)]]
print(df)
     a    b    c
0  1.0  4.0  NaN
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  NaN  NaN  NaN
Or use dropna with the parameter how='all' to remove only the columns that consist entirely of NaNs:
print(df.dropna(axis=1, how='all'))
     a    b    c
0  1.0  4.0  NaN
1  2.0  NaN  8.0
2  NaN  6.0  9.0
3  NaN  NaN  NaN
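As a quick sanity check, the three approaches above can be run side by side on a fresh copy of the question's DataFrame. This is only a sketch; the names mask, via_loc, via_columns and via_dropna are made up for the comparison.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan, np.nan],
                   'b': [4, np.nan, 6, np.nan],
                   'c': [np.nan, 8, 9, np.nan],
                   'd': [np.nan, np.nan, np.nan, np.nan]})

# True for every column that holds at least one non-NaN value
mask = df.notnull().any(axis=0)

via_loc = df.loc[:, mask]                  # boolean mask in the column slot of loc
via_columns = df[df.columns[mask]]         # select the surviving labels with []
via_dropna = df.dropna(axis=1, how='all')  # drop columns that are entirely NaN

# DataFrame.equals treats NaNs in the same positions as equal
print(via_loc.equals(via_columns) and via_loc.equals(via_dropna))  # True
```

If the goal is only to drop all-NaN columns, dropna states the intent most directly, while the mask-based forms generalize to any other per-column condition.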