How do I use a specific column's value in a Pandas DataFrame where clause?

Question

I'm trying to select all cells in a pandas DataFrame that meet a certain criterion when a specific column also meets a separate criterion.

Given the following DataFrame:

      A    B    C    D
1/1   0    1    0    1
1/2   2    1    1    1
1/3   3    0    1    0 
1/4   1    0    1    2
1/5   1    0    1    1
1/6   2    0    2    1
1/7   3    5    2    3

I would like to select the data where a column is greater than its previous value, but only in rows where D is also > 1. The syntax I'm currently trying to use is:

matches = df[(df > df.shift(1)) & (df.D > 1)]

However, when I do this, I receive the following error:

TypeError: Could not operate [array([nan, nan, nan, nan], dtype=object)] with block values [operands could not be broadcast together with shapes (2016) (4) ]

Note: the error is a direct copy and paste from my actual code, so the description and the shapes in the error do not correspond directly to my example DataFrame.

I know that df.D > 1 is causing the problem, and that comparing columns directly to D is valid (df > df.D, for example). What is wrong with my syntax when trying to compare D to the value 1, and how can I accomplish this?

Recommended answer

This should work directly, but pandas doesn't have a broadcasting & operator yet (that will happen in 0.14), so the DataFrame mask and the Series mask can't be combined in a single expression. Here's a workaround.

In [74]: df
Out[74]: 
     A  B  C  D
1/1  0  1  0  1
1/2  2  1  1  1
1/3  3  0  1  0
1/4  1  0  1  2
1/5  1  0  1  1
1/6  2  0  2  1
1/7  3  5  2  3

This is a where operation: it essentially puts np.nan wherever the condition is False.

In [78]: x = df[df>df.shift(1)]

In [79]: x
Out[79]: 
      A   B   C   D
1/1 NaN NaN NaN NaN
1/2   2 NaN   1 NaN
1/3   3 NaN NaN NaN
1/4 NaN NaN NaN   2
1/5 NaN NaN NaN NaN
1/6   2 NaN   2 NaN
1/7   3   5 NaN   3

Then select rows based on the second condition:

In [80]: x[df.D>1]
Out[80]: 
      A   B   C  D
1/4 NaN NaN NaN  2
1/7   3   5 NaN  3
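
For reference, the two steps above can be reproduced as a standalone script. This is only a sketch: the DataFrame is rebuilt by hand from the example in the question, and the names increased, matches, row_ok, mask and masked are made up for illustration. The second half shows one possible way to build a single combined mask by broadcasting the row condition across the columns with NumPy; it is an alternative to the answer's workaround, not part of it, and it keeps every row rather than dropping the rows where D <= 1.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "A": [0, 2, 3, 1, 1, 2, 3],
        "B": [1, 1, 0, 0, 0, 0, 5],
        "C": [0, 1, 1, 1, 1, 2, 2],
        "D": [1, 1, 0, 2, 1, 1, 3],
    },
    index=["1/1", "1/2", "1/3", "1/4", "1/5", "1/6", "1/7"],
)

# Step 1: keep cells greater than the previous row's value, NaN elsewhere.
# df.where(cond) is the explicit form of df[cond] for a boolean DataFrame.
increased = df.where(df > df.shift(1))

# Step 2: keep only the rows where D > 1 (row filter with a boolean Series).
matches = increased.loc[df["D"] > 1]
print(matches)

# Alternative (assumption, not from the answer): one combined mask.
# Reshape the row condition to (n_rows, 1) so NumPy broadcasts it over columns.
row_ok = (df["D"] > 1).to_numpy()[:, None]
mask = (df > df.shift(1)).to_numpy() & row_ok
masked = df.where(mask)  # same shape as df; NaN wherever mask is False
print(masked)
```

The two-step form filters rows, so the result has only the rows 1/4 and 1/7. The broadcast form masks cells, so the result has the same shape as df, which may be preferable when the original index needs to be preserved.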
