How to select rows from a DataFrame based on column values

Views: 417 | Published: 2020/10/16 20:43:44 | Tags: python pandas dataframe


Question

How can I select rows from a DataFrame based on the values in some column in pandas?

In SQL, I would use:

SELECT *
FROM table
WHERE column_name = some_value

I tried looking at the pandas documentation, but I did not immediately find the answer.

Recommended answer

To select rows whose column value equals a scalar, some_value, use ==:

df.loc[df['column_name'] == some_value]

To select rows whose column value is in an iterable, some_values, use isin:

df.loc[df['column_name'].isin(some_values)]

To combine multiple conditions, use &:

df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]

Note the parentheses. Due to Python's operator precedence rules, & binds more tightly than <= and >=, so the parentheses in the last example are necessary. Without the parentheses,

df['column_name'] >= A & df['column_name'] <= B

is parsed as

df['column_name'] >= (A & df['column_name']) <= B

This is a chained comparison, so Python has to take the truth value of an intermediate Series, which results in a "Truth value of a Series is ambiguous" error.
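
The precedence issue can be reproduced with a minimal sketch. The data, the column name column_name, and the bounds A and B below are placeholders chosen for illustration, not values from the original question.

```python
import pandas as pd

# Hypothetical data; the column name and bounds are placeholders.
df = pd.DataFrame({'column_name': [1, 5, 10, 15, 20]})
A, B = 5, 15

# Parenthesized comparisons: each side is a boolean Series, combined with &.
in_range = df.loc[(df['column_name'] >= A) & (df['column_name'] <= B)]
print(in_range['column_name'].tolist())

# Without parentheses, A & df['column_name'] is evaluated first and the
# comparisons are chained, which needs the truth value of a Series.
try:
    df.loc[df['column_name'] >= A & df['column_name'] <= B]
    raised = False
except ValueError as exc:
    raised = True
    print(type(exc).__name__)
```

The same rule applies when conditions are combined with | (or): wrap each comparison in its own parentheses.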

To select rows whose column value does not equal some_value, use !=:
df.loc[df['column_name'] != some_value]

isin returns a boolean Series, so to select rows whose value is not in some_values, negate the boolean Series using ~:
df.loc[~df['column_name'].isin(some_values)]
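
As a small sketch of both negated forms, the snippet below uses made-up data. The names some_value and some_values follow the text above, but their contents are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({'column_name': ['a', 'b', 'c', 'a']})
some_value = 'a'
some_values = ['a', 'b']

# Rows whose value differs from a single scalar.
not_equal = df.loc[df['column_name'] != some_value]

# Rows whose value is not in the collection: ~ flips each boolean in the mask.
not_in = df.loc[~df['column_name'].isin(some_values)]

print(not_equal['column_name'].tolist())
print(not_in['column_name'].tolist())
```

Python's not keyword would not work here, because it asks for the truth value of the whole Series rather than negating it element by element.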

For example,
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
print(df)
#      A      B  C   D
# 0  foo    one  0   0
# 1  bar    one  1   2
# 2  foo    two  2   4
# 3  bar  three  3   6
# 4  foo    two  4   8
# 5  bar    two  5  10
# 6  foo    one  6  12
# 7  foo  three  7  14

print(df.loc[df['A'] == 'foo'])

yields
     A      B  C   D
0  foo    one  0   0
2  foo    two  2   4
4  foo    two  4   8
6  foo    one  6  12
7  foo  three  7  14

If you have multiple values you want to include, put them in a list (or, more generally, any iterable) and use isin:
print(df.loc[df['B'].isin(['one','three'])])

yields
     A      B  C   D
0  foo    one  0   0
1  bar    one  1   2
3  bar  three  3   6
6  foo    one  6  12
7  foo  three  7  14
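
To illustrate the "any iterable" remark, here is a sketch that rebuilds the same example DataFrame and passes a list, a set, and a tuple to isin. The variable names wanted_list, wanted_set, and wanted_tuple are introduced here for illustration.

```python
import pandas as pd
import numpy as np

# Same example data as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

wanted_list = ['one', 'three']
wanted_set = {'one', 'three'}
wanted_tuple = ('one', 'three')

# Each container yields the same boolean mask, so the selections match.
from_list = df.loc[df['B'].isin(wanted_list)]
from_set = df.loc[df['B'].isin(wanted_set)]
from_tuple = df.loc[df['B'].isin(wanted_tuple)]

print(from_list.equals(from_set) and from_list.equals(from_tuple))
```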

Note, however, that if you wish to do this many times, it is more efficient to make an index first, and then use df.loc:
df = df.set_index(['B'])
print(df.loc['one'])

yields
       A  C   D
B              
one  foo  0   0
one  bar  1   2
one  foo  6  12
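
The index-based lookup returns the same rows as the boolean mask, which the sketch below checks on the example data. The names by_mask, indexed, and by_index are introduced here for illustration. Whether the lookup is actually faster depends on the data and on how often it is repeated, so it is worth timing on your own DataFrame.

```python
import pandas as pd
import numpy as np

# Same example data as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})

# Boolean-mask selection on the original column.
by_mask = df.loc[df['B'] == 'one']

# The same rows via a label lookup after moving B into the index.
indexed = df.set_index('B')
by_index = indexed.loc['one']

# Compare the remaining columns row by row.
same = by_mask[['A', 'C', 'D']].values.tolist() == by_index.values.tolist()
print(same)
```

One caveat: when a label occurs only once in the index, indexed.loc[label] returns a Series rather than a DataFrame. Passing a one-element list, as in indexed.loc[[label]], keeps the result a DataFrame.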

or, to include multiple values from the index, use df.index.isin:
df.loc[df.index.isin(['one','two'])]

yields
       A  C   D
B              
one  foo  0   0
one  bar  1   2
two  foo  2   4
two  foo  4   8
two  bar  5  10
one  foo  6  12
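
As a sketch of how the index mask behaves, the snippet below rebuilds the example, selects with df.index.isin, and also negates the mask. The names indexed, mask, selected, and excluded are introduced here for illustration.

```python
import pandas as pd
import numpy as np

# Same example data as above, rebuilt so the snippet runs on its own.
df = pd.DataFrame({'A': 'foo bar foo bar foo bar foo foo'.split(),
                   'B': 'one one two three two two one three'.split(),
                   'C': np.arange(8), 'D': np.arange(8) * 2})
indexed = df.set_index('B')

# Index.isin returns a boolean array aligned with the rows.
mask = indexed.index.isin(['one', 'two'])

# Rows keep their original order; they are not grouped by label.
selected = indexed.loc[mask]
print(selected.index.tolist())

# ~ negates the mask, selecting every other index label.
excluded = indexed.loc[~mask]
print(excluded.index.tolist())
```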

