Get first row of dataframe in Python Pandas based on criteria

Question

Let's say that I have a dataframe like this one

import pandas as pd
df = pd.DataFrame([[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]], columns=['A', 'B', 'C'])

>> df
   A  B  C
0  1  2  1
1  1  3  2
2  4  6  3
3  4  3  4
4  5  4  5

The original table is more complicated with more columns and rows.

I want to get the first row that fulfils some criteria. Examples:

  1. Get the first row where A > 3 (returns row 2)
  2. Get the first row where A > 4 AND B > 3 (returns row 4)
  3. Get the first row where A > 3 AND (B > 3 OR C > 2) (returns row 2)

But if no row fulfils the specific criteria, then I want to get the first row after sorting the dataframe in descending order by A (or, in other cases, by B, C, etc.):

  1. Get the first row where A > 6 (returns row 4, by ordering by A descending and taking the first row)

I was able to do it by iterating over the dataframe (I know that's crap :P), so I would prefer a more pythonic way to solve it.

Recommended answer

This tutorial is a very good one for pandas slicing. Make sure you check it out. Onto some snippets... To slice a dataframe with a condition, you use this format:

>>> df[condition]

This will return a slice of your dataframe which you can index using iloc. Here are your examples:

  1. Get first row where A > 3 (returns row 2)

>>> df[df.A > 3].iloc[0]
A    4
B    6
C    3
Name: 2, dtype: int64

If what you actually want is the row number (its index label) rather than the row itself, use df[df.A > 3].index[0] instead of iloc.
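
To make the difference between the row and its label concrete, here is a minimal sketch using the question's dataframe. The relabelled copy and the variable names are illustrative assumptions, not part of the original answer. It also shows a guard for the case where nothing matches, since indexing position 0 of an empty result raises IndexError.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["A", "B", "C"],
)

matches = df[df.A > 3]

# Index label of the first matching row in the original frame.
first_label = matches.index[0]

# With the default integer index the label equals the row position.
# After relabelling (hypothetical labels a..e) the two differ.
relabelled = df.copy()
relabelled.index = list("abcde")
first_label_relabelled = relabelled[relabelled.A > 3].index[0]

# Both .iloc[0] and .index[0] raise IndexError on an empty result,
# so check .empty first when a match is not guaranteed.
no_match = df[df.A > 6]
has_match = not no_match.empty
```

Here first_label is 2, first_label_relabelled is 'c', and has_match is False.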

  2. Get first row where A > 4 AND B > 3 (returns row 4):

>>> df[(df.A > 4) & (df.B > 3)].iloc[0]
A    5
B    4
C    5
Name: 4, dtype: int64

  • Get first row where A > 3 AND (B > 3 OR C > 2) (returns row 2)

    >>> df[(df.A > 3) & ((df.B > 3) | (df.C > 2))].iloc[0]
    A    4
    B    6
    C    3
    Name: 2, dtype: int64
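
A note on the combined conditions: each comparison is wrapped in parentheses because & and | bind more tightly than > in Python. As an alternative that some find easier to read, the same filter can be written with DataFrame.query. The sketch below is an illustration and not part of the original answer; it compares the two forms on the question's dataframe.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["A", "B", "C"],
)

# Boolean-mask form: element-wise & and | with explicit parentheses.
mask_row = df[(df.A > 3) & ((df.B > 3) | (df.C > 2))].iloc[0]

# Same filter as a query string, where `and` / `or` are allowed.
query_row = df.query("A > 3 and (B > 3 or C > 2)").iloc[0]

# Both forms select the same first row (label 2).
same_row = mask_row.equals(query_row)
```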
    

  • Now, for your last case, we can write a function that falls back to the first row of the descending-sorted frame when no row matches:

    >>> def series_or_default(X, condition, default_col, ascending=False):
    ...     sliced = X[condition]
    ...     if sliced.shape[0] == 0:
    ...         return X.sort_values(default_col, ascending=ascending).iloc[0]
    ...     return sliced.iloc[0]
    >>> 
    >>> series_or_default(df, df.A > 6, 'A')
    A    5
    B    4
    C    5
    Name: 4, dtype: int64
    

    As expected, it returns row 4.
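
The question also mentions falling back to other sort columns (B, C, etc.). The following is a small variant of the function above, written as a standalone script. The name first_row_or_default and the use of .empty instead of shape[0] == 0 are choices made for this sketch, not the original answer's code.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 1], [1, 3, 2], [4, 6, 3], [4, 3, 4], [5, 4, 5]],
    columns=["A", "B", "C"],
)


def first_row_or_default(frame, mask, sort_by, ascending=False):
    """Return the first row selected by mask, or the first row of the
    frame sorted by sort_by when no row is selected."""
    matches = frame[mask]
    if matches.empty:
        return frame.sort_values(sort_by, ascending=ascending).iloc[0]
    return matches.iloc[0]


# No row has A > 6, so each call falls back to the sorted frame.
fallback_by_a = first_row_or_default(df, df.A > 6, "A")  # largest A: row 4
fallback_by_b = first_row_or_default(df, df.A > 6, "B")  # largest B: row 2

# When the condition matches, the sort column is ignored.
direct_hit = first_row_or_default(df, df.A > 3, "B")  # first A > 3: row 2
```

Note that the fallback assumes the frame itself has at least one row; on an empty frame, .iloc[0] would still raise IndexError.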
