Python pandas.core.indexing.IndexingError: Unalignable boolean Series key provided


Question

I read in a data table with 29 columns and added one index column (so 30 in total).

Data = pd.read_excel(os.path.join(BaseDir, 'test.xlsx'))
Data.reset_index(inplace=True)

Then I wanted to subset the data to include only the columns whose names contain "ref" or "Ref". I got the code below from another Stack post:

col_keep = Data.ix[:, pd.Series(Data.columns.values).str.contains('ref', case=False)]

However, I keep getting this error:

    print(len(Data.columns.values))
    30
    print(pd.Series(Data.columns.values).str.contains('ref', case=False))
    0     False
    1     False
    2     False
    3     False
    4     False
    5     False
    6     False
    7     False
    8     False
    9     False
    10    False
    11    False
    12    False
    13    False
    14    False
    15    False
    16    False
    17    False
    18    False
    19    False
    20    False
    21    False
    22    False
    23    False
    24     True
    25     True
    26     True
    27     True
    28    False
    29    False
    dtype: bool

Traceback (most recent call last):
  File "C:/Users/lala.py", line 26, in <module>
    col_keep = FedexData.ix[:, pd.Series(FedexData.columns.values).str.contains('ref', case=False)]
  File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexing.py", line 84, in __getitem__
    return self._getitem_tuple(key)
  File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexing.py", line 816, in _getitem_tuple
    retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
  File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexing.py", line 1014, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexing.py", line 1041, in _getitem_iterable
    key = check_bool_indexer(labels, key)
  File "C:\Users\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pandas\core\indexing.py", line 1817, in check_bool_indexer
    raise IndexingError('Unalignable boolean Series key provided')
pandas.core.indexing.IndexingError: Unalignable boolean Series key provided

So the boolean values are correct, but why is it not working? Why does the error keep popping up?

Any help/hint is appreciated! Thank you so much.

Recommended answer

I can reproduce a similar error message this way:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(4, size=(10,4)), columns=list('ABCD'))
df.ix[:, pd.Series([True,False,True,False])]

This raises (using Pandas version 0.21.0.dev+25.g50e95e0):

pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match

The problem occurs because Pandas tries to align the index of the Series with the column index of the DataFrame before masking with the Series' boolean values. Since df has column labels 'A', 'B', 'C', 'D' and the Series has index labels 0, 1, 2, 3, Pandas complains that the labels are unalignable.
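The alignment behaviour can be checked directly. The following is a minimal sketch, assuming a recent Pandas in which .ix is no longer available, so .loc is used in its place; the names positional_mask and labelled_mask are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(4, size=(10, 4)), columns=list("ABCD"))

# Boolean Series with the default integer index 0..3: its labels do not
# match the column labels 'A'..'D', so alignment fails.
positional_mask = pd.Series([True, False, True, False])
try:
    df.loc[:, positional_mask]
    error_name = None
except Exception as exc:  # expected to be pandas' IndexingError
    error_name = type(exc).__name__

# Same booleans, but indexed by the column labels: alignment succeeds.
labelled_mask = pd.Series([True, False, True, False], index=df.columns)
selected = df.loc[:, labelled_mask]

print(error_name)
print(list(selected.columns))
```

The only difference between the two masks is their index, which suggests the error is about labels rather than about the boolean values themselves.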

You probably don't want any index alignment, so pass a NumPy boolean array instead of a Pandas Series:

mask = pd.Series(Data.columns.values).str.contains('ref', case=False).values
col_keep = Data.loc[:, mask]

The Series.values attribute returns a NumPy array. Since DataFrame.ix will be removed in future versions of Pandas, use Data.loc instead of Data.ix here; .loc supports the boolean indexing we want.
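For comparison, here is a small sketch of the fix next to two other ways of getting the same columns. The DataFrame below is a hypothetical stand-in for the asker's spreadsheet (its column names are invented), and the two alternatives are not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the asker's spreadsheet.
Data = pd.DataFrame(
    [[0, 1, 2, 3, 4]],
    columns=["index", "Name", "Ref1", "ref_code", "Total"],
)

# 1. The approach from the answer: build a Series, then drop its index
#    by taking the underlying NumPy array.
mask = pd.Series(Data.columns.values).str.contains("ref", case=False).values
col_keep = Data.loc[:, mask]

# 2. Calling .str.contains on the column Index itself gives a plain
#    boolean array with no index, so there is nothing to align.
col_keep_2 = Data.loc[:, Data.columns.str.contains("ref", case=False)]

# 3. DataFrame.filter with a case-insensitive regular expression.
col_keep_3 = Data.filter(regex="(?i)ref")

print(list(col_keep.columns))
```

All three select by position or by pattern rather than by aligned labels, which is why none of them triggers the IndexingError.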
