Pandas DataFrame boolean mask on multiple columns
Question
I have a DataFrame (df) containing several measurement columns (A, B, ...) and, for each of them, a corresponding uncertainty column (dA, dB, ...):
A B dA dB
0 -1 3 0.31 0.08
1 2 -4 0.263 0.357
2 5 5 0.382 0.397
3 -4 -0.5 0.33 0.115
I apply a function to find the values in the measurement columns that are valid according to my definition:
df[["A","B"]].apply(lambda x: x.abs()-5*df['d'+x.name] > 0)
This returns a boolean DataFrame:
A B
0 False True
1 True True
2 True True
3 True False
I would like to use this array to select the rows of the DataFrame where the condition is true for a single column (e.g. for A, rows 1-3), and also to find the rows where the condition is true for all of the input columns (e.g. rows 1 and 2). Is there an efficient way to do this with pandas?
Recommended answer
You can use the result of your apply statement as a boolean index to select rows from the original DataFrame:
results = df[["A","B"]].apply(lambda x: x.abs()-5*df['d'+x.name] > 0)
This returns the boolean array shown above:
A B
0 False True
1 True True
2 True True
3 True False
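For a reproducible version of this step, here is a minimal sketch that rebuilds the example data and computes the mask. It also shows one possible vectorized variant that avoids apply; that variant is not part of the original answer and assumes every measurement column X has an uncertainty column named dX. The names value_cols, uncertainties and results_vec are made up for illustration.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "A": [-1, 2, 5, -4],
    "B": [3, -4, 5, -0.5],
    "dA": [0.31, 0.263, 0.382, 0.33],
    "dB": [0.08, 0.357, 0.397, 0.115],
})

value_cols = ["A", "B"]  # hypothetical helper list of measurement columns

# Column-by-column version, as in the answer.
results = df[value_cols].apply(lambda x: x.abs() - 5 * df["d" + x.name] > 0)

# Vectorized variant (an assumption, not from the answer): convert the
# uncertainty columns to a NumPy array so the comparison is positional
# instead of being aligned on column labels (A/B vs. dA/dB).
uncertainties = df[["d" + c for c in value_cols]].to_numpy()
results_vec = df[value_cols].abs() > 5 * uncertainties

print(results)
print(results_vec)
```

Both masks keep the index and column labels of df[["A", "B"]], so either can be used for the selections that follow.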
Now you can use this array to select rows from your original DataFrame as follows.
Select the rows where A is True:
df[results.A]
A B dA dB
1 2 -4.0 0.263 0.357
2 5 5.0 0.382 0.397
3 -4 -0.5 0.330 0.115
Select the rows where either A or B is True:
df[results.any(axis=1)]
A B dA dB
0 -1 3.0 0.310 0.080
1 2 -4.0 0.263 0.357
2 5 5.0 0.382 0.397
3 -4 -0.5 0.330 0.115
Select the rows where all columns are True:
df[results.all(axis=1)]
A B dA dB
1 2 -4.0 0.263 0.357
2 5 5.0 0.382 0.397
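The selections above return whole rows. If only the measurement values are needed, the same masks can be combined with .loc, or with DataFrame.where to keep the original shape. The following is a minimal sketch under the same assumptions as the question's data; the variable names valid_a, valid_all and masked are illustrative only.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "A": [-1, 2, 5, -4],
    "B": [3, -4, 5, -0.5],
    "dA": [0.31, 0.263, 0.382, 0.33],
    "dB": [0.08, 0.357, 0.397, 0.115],
})
results = df[["A", "B"]].apply(lambda x: x.abs() - 5 * df["d" + x.name] > 0)

# Only the values of column A, in the rows where A passes the check.
valid_a = df.loc[results["A"], "A"]

# Measurement columns only, in the rows where every column passes.
valid_all = df.loc[results.all(axis=1), ["A", "B"]]

# Same shape as the measurements; failing cells are replaced by NaN.
masked = df[["A", "B"]].where(results)

print(valid_a)
print(valid_all)
print(masked)
```

Using .loc with a boolean Series makes the row and column selection explicit in one step, while where is convenient when later calculations should simply skip the invalid cells.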