pandas DataFrame where clause: dot versus bracket column selection

Question

I have a regular DataFrame with a string type (object) column. When I try to filter on the column using the equivalent of a WHERE clause, I get a KeyError with dot notation. With bracket notation, everything works.

I am referring to these statements:

df[df.colA == 'blah']
df[df['colA'] == 'blah']

The first one gives:

KeyError: False

Not posting an example as I cannot reproduce the issue on a bespoke DataFrame built for the purpose of illustration: when I do, both notations yield the same result.

So my question is whether the two notations differ, and if so, why.

Recommended answer

The dot notation is just a convenient shortcut for the standard bracket access. Notably, it does not work when the column name, such as sum, is already the name of a DataFrame method or attribute. My bet is that the column name in your real example runs into that issue: bracket selection works fine, but the dot version tests whether a method is equal to 'blah'. That comparison evaluates to the plain boolean False, so df[False] looks for a column labelled False and raises KeyError: False.

A simple example:

In [67]: df = pd.DataFrame(np.arange(10).reshape(5,2), columns=["number", "sum"])

In [68]: df
Out[68]:
   number  sum
0       0    1
1       2    3
2       4    5
3       6    7
4       8    9

In [69]: df.number == 0
Out[69]:
0     True
1    False
2    False
3    False
4    False
Name: number, dtype: bool

In [70]: df.sum == 0
Out[70]: False

In [71]: df['sum'] == 0
Out[71]:
0    False
1    False
2    False
3    False
4    False
Name: sum, dtype: bool
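
To connect the session above back to the error in the question, here is a minimal sketch that reuses the same DataFrame. It assumes a reasonably recent pandas. The variable names (mask, error, colliding, bracket_rows) are illustrative only, and the hasattr check is just one possible way to spot column names that clash with DataFrame attributes.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(10).reshape(5, 2), columns=["number", "sum"])

# df.sum is the bound DataFrame.sum method, not the "sum" column,
# so the comparison yields a single Python bool instead of a Series.
mask = df.sum == 0

# Indexing with that scalar asks for a column labelled False.
try:
    df[mask]
    error = None
except KeyError as exc:
    error = exc

# Bracket access always reaches the column, so filtering works.
bracket_rows = df[df["sum"] == 1]

# Illustrative check (an assumption, not part of the original answer):
# list the column names that are shadowed by DataFrame attributes.
colliding = [c for c in df.columns if hasattr(pd.DataFrame, c)]

print(type(error).__name__, colliding)
```

Bracket notation is the safer default whenever column names come from external data, since it never depends on what attributes DataFrame happens to define.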
