Python Pandas: Get index of rows which column matches certain value


Question

Given a DataFrame with a column "BoolCol", we want to find the indexes of the DataFrame rows in which "BoolCol" == True.

I currently do this by iterating, which works perfectly:

for i in range(100, 3000):
    if df.iloc[i]['BoolCol'] == True:
        print(i, df.iloc[i]['BoolCol'])

But this is not the idiomatic pandas way to do it. After some research, I am currently using this code:

df[df['BoolCol'] == True].index.tolist()

This gives me a list of indexes, but they don't match when I check them with:

df.iloc[i]['BoolCol']

The result is actually False!

What would be the correct pandas way to do this?

Recommended answer

df.iloc[i] returns the i-th row of df. Here i does not refer to the index label; it is a 0-based row position.
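To make the mismatch from the question concrete, here is a minimal sketch. The frame and its shuffled index are made up for illustration; the point is only that labels collected from the index are not valid positions for iloc.

```python
import pandas as pd

# Hypothetical frame whose index labels differ from the row positions.
df = pd.DataFrame({'BoolCol': [True, False, False, True, True]},
                  index=[3, 0, 4, 1, 2])

# Index labels of the rows where BoolCol is True.
labels = df[df['BoolCol'] == True].index.tolist()

# Treating the labels as positions (iloc) looks at the wrong rows...
by_position = [bool(df.iloc[i]['BoolCol']) for i in labels]

# ...while looking them up as labels (loc) finds the intended rows.
by_label = [bool(df.loc[i, 'BoolCol']) for i in labels]

print(labels, by_position, by_label)
```

With a default RangeIndex the labels and positions coincide, which is why the problem only shows up once the index has been reordered, filtered, or set to something else.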

In contrast, the index attribute returns the actual index labels, not numeric row positions:

df.index[df['BoolCol'] == True].tolist()

or, equivalently,

df.index[df['BoolCol']].tolist()

You can see the difference quite clearly by playing with a DataFrame that has an "unusual" index:

import pandas as pd

df = pd.DataFrame({'BoolCol': [True, False, False, True, True]},
                  index=[10, 20, 30, 40, 50])

In [53]: df
Out[53]: 
   BoolCol
10    True
20   False
30   False
40    True
50    True

[5 rows x 1 columns]

In [54]: df.index[df['BoolCol']].tolist()
Out[54]: [10, 40, 50]

If you want to use the index,

In [56]: idx = df.index[df['BoolCol']]

In [57]: idx
Out[57]: Int64Index([10, 40, 50], dtype='int64')

then you can select the rows using loc instead of iloc:

In [58]: df.loc[idx]
Out[58]: 
   BoolCol
10    True
40    True
50    True

[3 rows x 1 columns]

Note that loc can also accept boolean arrays:

In [55]: df.loc[df['BoolCol']]
Out[55]: 
   BoolCol
10    True
40    True
50    True

[3 rows x 1 columns]

If you have a boolean array, mask, and need ordinal index values, you can compute them using np.flatnonzero:

In [110]: np.flatnonzero(df['BoolCol'])
Out[110]: array([0, 3, 4])
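As a self-contained version of the step above, the following sketch builds the mask explicitly. The variable names are illustrative, and np.where(mask)[0] is shown only as an equivalent spelling for a one-dimensional mask.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'BoolCol': [True, False, False, True, True]},
                  index=[10, 20, 30, 40, 50])

# Plain NumPy boolean array taken from the column.
mask = df['BoolCol'].to_numpy()

# Ordinal (0-based) positions of the True entries.
positions = np.flatnonzero(mask)

# Equivalent result for a 1-D mask.
same_positions = np.where(mask)[0]

print(positions, same_positions)
```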

Use df.iloc to select rows by ordinal index:

In [113]: df.iloc[np.flatnonzero(df['BoolCol'])]
Out[113]: 
   BoolCol
10    True
40    True
50    True
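If you need to move between the two kinds of index, a sketch along these lines may help. It assumes the index labels are unique; Index.get_indexer is used here as one possible way to map labels back to positions, not as the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'BoolCol': [True, False, False, True, True]},
                  index=[10, 20, 30, 40, 50])

# Ordinal positions of the rows where BoolCol is True.
positions = np.flatnonzero(df['BoolCol'].to_numpy())

# Positions -> labels: index the Index object by position.
labels = df.index[positions]

# Labels -> positions: get_indexer expects a unique index.
back = df.index.get_indexer(labels)

# Both selection styles pick out the same rows.
same_rows = df.iloc[positions].equals(df.loc[labels])

print(labels.tolist(), back.tolist(), same_rows)
```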
