Selecting rows from a DataFrame based on values from multiple columns in pandas
Question
This question is closely related to two other questions, another and thisone, and I'll use the example from the very helpful accepted solution on that question. Here's the example from the accepted solution (credit to unutbu):
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14
print(df.loc[df['A'] == 'foo'])
yields
     A      B  C   D
0  foo    one  0   0
2  foo    two  2   4
4  foo    two  4   8
6  foo    one  6  12
7  foo  three  7  14
But I want all of column A, and only the rows where B is 'two'. My attempt was:
print(df.loc[df['A']) & df['B'] == 'two'])
Unfortunately, this does not work. Can anybody suggest a way to implement something like this? It would be a great help if the solution were somewhat general, for example for the case where column A does not hold a single value such as 'foo' but several different values, and you still want the whole column.
Recommended answer
I think I understand your modified question. After sub-selecting on a condition on B, you can select the columns you want, such as:
In [1]: df.loc[df.B == 'two'][['A', 'B']]
Out[1]:
     A    B
2  foo  two
4  foo  two
5  bar  two
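As a side note, the row filter and the column selection can also be passed to .loc in a single call, which avoids the chained indexing above. The following is a minimal sketch on the question's example frame; the variable name subset is only illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# First argument: boolean row mask. Second argument: list of column labels.
subset = df.loc[df['B'] == 'two', ['A', 'B']]
print(subset)
```

This should select the same rows and columns as the chained form, in one indexing operation.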
For example, if I wanted to concatenate all the strings in column A for which column B has the value 'two', I could do:
In [2]: df.loc[df.B =='two'].A.sum() # <-- use .mean() for your quarterly data
Out[2]: 'foofoobar'
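If you prefer not to rely on .sum() concatenating strings, one possible alternative is Python's own str.join over the selected values. This is a sketch on the question's example frame, not part of the original answer; the name joined is illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Select the A values of the rows where B == 'two', then join them in row order.
joined = ''.join(df.loc[df['B'] == 'two', 'A'])
print(joined)
```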
You could also group by the values of column B and get such a concatenation for every distinct B group in a single expression:
In [3]: df.groupby('B').apply(lambda x: x.A.sum())
Out[3]:
B
one      foobarfoo
three       barfoo
two      foofoobar
dtype: object
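The same per-group concatenation can be sketched by selecting column A on the grouped object and aggregating it, so the function only ever sees that one column. This is an alternative formulation rather than the answer's own code; the name per_group is illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Group by B, keep only column A, and join each group's strings in row order.
per_group = df.groupby('B')['A'].agg(''.join)
print(per_group)
```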
To filter on both A and B, use numpy.logical_and:
In [1]: df.loc[np.logical_and(df.A == 'foo', df.B == 'two')]
Out[1]:
     A    B  C  D
2  foo  two  2  4
4  foo  two  4  8
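The attempt in the question can also be repaired with the & operator directly. In Python, & binds more tightly than ==, so each comparison needs its own parentheses; the original attempt also had no comparison on A and a stray closing parenthesis. The sketch below shows the parenthesized form, plus one way to generalize to several values of A with Series.isin. The list wanted_a is a hypothetical example, not something from the question.

```python
import numpy as np
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Each comparison is wrapped in parentheses before combining with &.
both = df.loc[(df['A'] == 'foo') & (df['B'] == 'two')]
print(both)

# Hypothetical generalization: accept any of several values in A.
wanted_a = ['foo', 'bar']  # illustrative list, an assumption
general = df.loc[df['A'].isin(wanted_a) & (df['B'] == 'two')]
print(general)
```

The first selection should match the numpy.logical_and result above; the second keeps every row where B is 'two' and A is any of the listed values.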