Return list of indices/index where a min/max value occurs in a pandas dataframe


Question

I'd like to search a pandas DataFrame for minimum values. I need the min in the entire dataframe (across all values), analogous to df.min().min(). However, I also need to know the index of the location(s) where this value occurs.

I've tried a number of different approaches:

  • df.where(df == (df.min().min())),
  • df.where(df == df.min().min()).notnull() and
  • val_mask = df == df.min().min(); df[val_mask].

These return a dataframe with NaN at the non-min positions (or a dataframe of booleans), but I can't figure out a way to get the (row, col) of these locations.

Is there a more elegant way of searching a dataframe for a min/max and returning a list containing all of the locations of the occurrence(s)?

import pandas as pd

keys = ['x', 'y', 'z']
vals = [[1, 2, -1], [3, 5, 1], [4, 2, 3]]
data = dict(zip(keys, vals))
df = pd.DataFrame(data)

lowest = df.min().min()  # minimum over the whole DataFrame
list_of_lowest = []

# DataFrame.iteritems() was removed in pandas 2.0; items() is the equivalent
for column_name, column in df.items():
    matches = column.where(column == lowest).dropna()
    if len(matches) > 0:
        print(column_name, matches)
        list_of_lowest.append([column_name, matches])

list_of_lowest
output: [['x', 2   -1.0
Name: x, dtype: float64]]

Recommended answer

Based on your revised update:

In [209]:
keys = ['x', 'y', 'z'] 
vals = [[1,2,-1], [3,5,-1], [4,2,3]] 
data = dict(zip(keys,vals)) 
df = pd.DataFrame(data)
df

Out[209]:
   x  y  z
0  1  3  4
1  2  5  2
2 -1 -1  3

Then do the following:

In [211]:
df[df==df.min().min()].dropna(axis=1, thresh=1).dropna()

Out[211]:
     x    y
2 -1.0 -1.0

So this uses the boolean mask on the df:

In [212]:
df[df==df.min().min()]

Out[212]:
     x    y   z
0  NaN  NaN NaN
1  NaN  NaN NaN
2 -1.0 -1.0 NaN

We then call dropna with axis=1 and thresh=1, which drops the columns that don't have at least 1 non-NaN value:

In [213]:
df[df==df.min().min()].dropna(axis=1, thresh=1)

Out[213]:
     x    y
0  NaN  NaN
1  NaN  NaN
2 -1.0 -1.0
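
To see what thresh=1 changes compared with the default dropna(), here is a minimal sketch. The frame `demo` and the variable names are hypothetical and not part of the original answer; the data is chosen so that the minimum falls in different rows of different columns.

```python
import pandas as pd

# Hypothetical data: the minimum (-1) sits in row 2 of 'x' and row 0 of 'y'.
demo = pd.DataFrame({'x': [1, 2, -1], 'y': [-1, 5, 3], 'z': [4, 2, 3]})

# Keep only the cells equal to the overall minimum; everything else becomes NaN.
masked = demo[demo == demo.min().min()]

# axis=1, thresh=1: keep columns with at least one non-NaN value ('z' is dropped).
cols_kept = masked.dropna(axis=1, thresh=1)

# Default dropna() uses how='any', so a row with a single NaN is dropped.
rows_default = cols_kept.dropna()

# thresh=1 keeps every row that has at least one non-NaN value.
rows_thresh = cols_kept.dropna(thresh=1)

print(rows_default.empty)          # True: no row is non-NaN in both columns
print(rows_thresh.index.tolist())  # [0, 2]
```

With this data the plain dropna() discards every row, while thresh=1 keeps both rows in which the minimum occurs.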

It is probably safer to call dropna on the rows with thresh=1 as well, since a plain dropna() drops every row that contains any NaN:

In [214]:
df[df==df.min().min()].dropna(axis=1, thresh=1).dropna(thresh=1)

Out[214]:
     x    y
2 -1.0 -1.0
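
If a plain list of (row, col) locations is wanted, as the question asks, the boolean mask can be converted directly. This is a sketch of one possible approach rather than part of the original answer: it assumes NumPy is available, and the names `mask` and `locations` are illustrative.

```python
import numpy as np
import pandas as pd

keys = ['x', 'y', 'z']
vals = [[1, 2, -1], [3, 5, -1], [4, 2, 3]]
df = pd.DataFrame(dict(zip(keys, vals)))

# True wherever a cell equals the minimum of the whole DataFrame.
mask = df == df.min().min()

# np.where on a 2-D array returns the positional row and column indices
# of the True cells, in row-major order.
rows, cols = np.where(mask.to_numpy())

# Translate positions into index and column labels.
locations = list(zip(df.index[rows].tolist(), df.columns[cols].tolist()))
print(locations)  # [(2, 'x'), (2, 'y')]
```

The same pattern should work for the maximum by building the mask from df.max().max() instead.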
