Python Pandas: Boolean indexing on multiple columns
Question
Despite there being at least two good tutorials on how to index a DataFrame in Python's pandas library, I still can't work out an elegant way of SELECTing on more than one column.
>>> import pandas as pd
>>> d = pd.DataFrame({'x':[1, 2, 3, 4, 5], 'y':[4, 5, 6, 7, 8]})
>>> d
   x  y
0  1  4
1  2  5
2  3  6
3  4  7
4  5  8
>>> d[d['x']>2] # This works fine
   x  y
2  3  6
3  4  7
4  5  8
>>> d[d['x']>2 & d['y']>7] # I had expected this to work, but it doesn't
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I have found (what I think is) a rather inelegant way of doing it, like this:
>>> d[d['x']>2][d['y']>7]
But it's not pretty, and it scores fairly low for readability (I think).
Is there a better, more Python-tastic way?
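Recommended answer
It is an operator precedence issue. In Python, & binds more tightly than comparison operators such as >, so d['x']>2 & d['y']>7 is parsed as the chained comparison d['x'] > (2 & d['y']) > 7, which asks for the truth value of a whole array and raises the ValueError.
You should add extra parentheses to make your multi-condition test work: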
d[(d['x']>2) & (d['y']>7)]
This section of the tutorial you mentioned shows an example with several boolean conditions, and parentheses are used there.
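As a minimal, self-contained sketch of the point above, the snippet below rebuilds the question's DataFrame, confirms that the unparenthesised expression raises ValueError, and then applies the parenthesised mask. The variable names raised, mask, both and both_query are illustrative and not part of the original post. The DataFrame.query line is an alternative spelling added here for comparison, not something the answer itself proposes.

```python
import pandas as pd

d = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [4, 5, 6, 7, 8]})

# Without parentheses, `&` binds tighter than `>`, so Python reads
#   d['x'] > 2 & d['y'] > 7
# as the chained comparison
#   d['x'] > (2 & d['y']) > 7
# which needs the truth value of a whole Series and raises ValueError.
try:
    d[d['x'] > 2 & d['y'] > 7]
    raised = False
except ValueError:
    raised = True

# Parenthesise each comparison so that `&` combines two boolean
# Series element-wise into a single row mask.
mask = (d['x'] > 2) & (d['y'] > 7)
both = d[mask]

# Alternative (an addition for comparison): DataFrame.query takes the
# condition as a string, where `and` may be used between comparisons.
both_query = d.query('x > 2 and y > 7')

print(both)
```

Building the mask as a named variable first tends to read more clearly than chaining two selections such as d[d['x']>2][d['y']>7], and it filters the rows in a single step.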