Get the row and column labels for selected values in a Pandas dataframe

Question

I'd like to get the row and column labels for the values in a dataframe that match some condition. Just to keep it interesting, I need it to work with a hierarchical (multi-)index. For example:

df = pd.DataFrame(np.arange(16).reshape(4, 4), columns=pd.MultiIndex.from_product((('a', 'b'), ('x', 'y'))))

    a       b    
    x   y   x   y
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15

Now let's say I want the row and column labels of the elements where

df % 6 == 0

       a             b       
       x      y      x      y
0   True  False  False  False
1  False  False   True  False
2  False  False  False  False
3   True  False  False  False

I would like to get

[(0, ('a', 'x')), (1, ('b', 'x')), (3, ('a', 'x'))]

Please note that I would like a general solution, one that does not rely on the index being monotonic or on the particular selection in my example. This question has been asked many times, but the answers do not generalize:

  • index and column for the max value in pandas dataframe: relies on sorting to find max
  • Pandas dataframe: return row AND column of maximum value(s): does not generalize
  • Retrieve indices of NaN values in a pandas dataframe: does not return row label
  • Return list of indices/index where a min/max value occurs in a pandas dataframe: does not generalize
  • Pandas: Get each value's index and columns values: does not generalize / uses iteration

Is this really so hard in Pandas?

Recommended answer

Use np.where to obtain the ordinal (positional) indices of the True values, then use those positions to look up the labels in df.index and df.columns:

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(16).reshape(4, 4), 
                  columns=pd.MultiIndex.from_product((('a', 'b'), ('x', 'y'))))

mask = (df % 6 == 0)
i, j = np.where(mask)
print(list(zip(df.index[i], df.columns[j])))

This yields

[(0, ('a', 'x')), (1, ('b', 'x')), (3, ('a', 'x'))]
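
Because np.where returns positions rather than labels, the same approach should carry over to an index that is not monotonic. The sketch below wraps the answer's steps in a helper and runs it on a copy of the example frame with string row labels in arbitrary order. The helper name labels_where and the row labels are assumptions made for illustration; they are not part of the original answer.

```python
import numpy as np
import pandas as pd


def labels_where(mask):
    """Return (row_label, column_label) pairs for every True cell of a boolean DataFrame.

    Hypothetical helper: it only repackages the np.where lookup from the answer.
    """
    # np.where on a 2-D boolean array gives positional row and column indices
    rows, cols = np.where(mask.to_numpy())
    # Map the positions back to labels; MultiIndex columns come back as tuples
    return list(zip(mask.index[rows], mask.columns[cols]))


# Same data as the example, but with a non-monotonic row index (assumed labels)
df = pd.DataFrame(
    np.arange(16).reshape(4, 4),
    index=["d", "b", "c", "a"],
    columns=pd.MultiIndex.from_product((("a", "b"), ("x", "y"))),
)

pairs = labels_where(df % 6 == 0)
print(pairs)
# [('d', ('a', 'x')), ('b', ('b', 'x')), ('a', ('a', 'x'))]
```

The pairs come back in row-major order of the mask, so they follow the frame's row order rather than any sorted order of the labels.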
