Return list of indices/index where a min/max value occurs in a pandas dataframe
Question
I'd like to search a pandas DataFrame for minimum values. I need the min in the entire dataframe (across all values), analogous to df.min().min(). However, I also need to know the index of the location(s) where this value occurs.
I've tried a number of different approaches:
- df.where(df == (df.min().min())),
- df.where(df == df.min().min()).notnull() (source), and
- val_mask = df == df.min().min(); df[val_mask] (source).
These return a dataframe with NaN at the non-minimum positions (or a dataframe of booleans), but I can't figure out a way to get the (row, col) of these locations.
Is there a more elegant way of searching a dataframe for a min/max and returning a list containing all of the locations of the occurrence(s)?
import pandas as pd

keys = ['x', 'y', 'z']
vals = [[1, 2, -1], [3, 5, 1], [4, 2, 3]]
data = dict(zip(keys, vals))
df = pd.DataFrame(data)

list_of_lowest = []
# items() replaces iteritems(), which was removed in pandas 2.0
for column_name, column in df.items():
    # only report columns that contain the global minimum
    if len(df[column == df.min().min()]) > 0:
        print(column_name, column.where(column == df.min().min()).dropna())
        list_of_lowest.append([column_name, column.where(column == df.min().min()).dropna()])
list_of_lowest

output: [['x', 2   -1.0
Name: x, dtype: float64]]
Recommended answer
Based on your revised update:
In [209]:
keys = ['x', 'y', 'z']
vals = [[1,2,-1], [3,5,-1], [4,2,3]]
data = dict(zip(keys,vals))
df = pd.DataFrame(data)
df
Out[209]:
   x  y  z
0  1  3  4
1  2  5  2
2 -1 -1  3
Then do the following:
In [211]:
df[df==df.min().min()].dropna(axis=1, thresh=1).dropna()
Out[211]:
     x    y
2 -1.0 -1.0
So this uses the boolean mask on the df:
In [212]:
df[df==df.min().min()]
Out[212]:
     x    y   z
0  NaN  NaN NaN
1  NaN  NaN NaN
2 -1.0 -1.0 NaN
We then call dropna with axis=1 and the parameter thresh=1, which drops the columns that don't have at least 1 non-NaN value:
In [213]:
df[df==df.min().min()].dropna(axis=1, thresh=1)
Out[213]:
     x    y
0  NaN  NaN
1  NaN  NaN
2 -1.0 -1.0
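To see what thresh=1 changes on the row axis, the sketch below compares it with the default dropna() on a frame where the minimum sits in different rows of different columns. The frame df2 and the names strict and lenient are made up for illustration and are not part of the original question or answer.

```python
import pandas as pd

# Hypothetical data: the minimum (-1) occurs at (2, 'x') and at (0, 'y').
df2 = pd.DataFrame({'x': [1, 2, -1], 'y': [-1, 5, 3], 'z': [4, 2, 3]})

masked = df2[df2 == df2.min().min()]    # non-minimum cells become NaN
cols = masked.dropna(axis=1, thresh=1)  # keep columns with at least 1 non-NaN value

strict = cols.dropna()           # default how='any': drops every row containing a NaN
lenient = cols.dropna(thresh=1)  # keeps rows with at least 1 non-NaN value

print(strict)   # empty, because each row has a NaN in 'x' or 'y'
print(lenient)  # rows 0 and 2 remain
```

With the default call, the rows holding the minimum are lost as soon as the minimum is not in the same row for every remaining column.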
It is probably safer to make the second dropna call with thresh=1 as well, so that a row is kept whenever it contains at least one non-NaN value:
In [214]:
df[df==df.min().min()].dropna(axis=1, thresh=1).dropna(thresh=1)
Out[214]:
     x    y
2 -1.0 -1.0
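The question also asks for a list of the (row, col) locations themselves. One possible way to get it, as a sketch that goes beyond the answer above, is to pass the boolean mask to numpy.where and map the resulting integer positions back to the index and column labels. The names row_pos, col_pos and locations are illustrative.

```python
import numpy as np
import pandas as pd

keys = ['x', 'y', 'z']
vals = [[1, 2, -1], [3, 5, -1], [4, 2, 3]]
df = pd.DataFrame(dict(zip(keys, vals)))

mask = df == df.min().min()                   # True where a cell equals the global minimum
row_pos, col_pos = np.where(mask.to_numpy())  # integer positions of the True cells

# Translate the positions back into index and column labels.
locations = list(zip(df.index[row_pos], df.columns[col_pos]))
print(locations)  # row 2 in column 'x' and row 2 in column 'y'
```

For the maximum, the same sketch should work with df.max().max() in place of df.min().min().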