Get column and row index pairs of a Pandas DataFrame matching some criteria


Problem description

Suppose I have a Pandas DataFrame like the following. The values come from a distance matrix.

import numpy as np
import pandas as pd

# Pairwise matrix from the question (symmetric, 1.0 on the diagonal)
A = pd.DataFrame([
    (1.0, 0.8, 0.6708203932499369, 0.6761234037828132, 0.7302967433402214),
    (0.8, 1.0, 0.6708203932499369, 0.8451542547285166, 0.9128709291752769),
    (0.6708203932499369, 0.6708203932499369, 1.0, 0.5669467095138409, 0.6123724356957946),
    (0.6761234037828132, 0.8451542547285166, 0.5669467095138409, 1.0, 0.9258200997725514),
    (0.7302967433402214, 0.9128709291752769, 0.6123724356957946, 0.9258200997725514, 1.0),
])

Output:

Out[65]: 
          0         1         2         3         4
0  1.000000  0.800000  0.670820  0.676123  0.730297
1  0.800000  1.000000  0.670820  0.845154  0.912871
2  0.670820  0.670820  1.000000  0.566947  0.612372
3  0.676123  0.845154  0.566947  1.000000  0.925820
4  0.730297  0.912871  0.612372  0.925820  1.000000

I only want the upper triangle (excluding the diagonal).

c2 = A.copy()
c2.values[np.tril_indices_from(c2)] = np.nan

Output:

Out[67]: 
    0    1        2         3         4
0 NaN  0.8  0.67082  0.676123  0.730297
1 NaN  NaN  0.67082  0.845154  0.912871
2 NaN  NaN      NaN  0.566947  0.612372
3 NaN  NaN      NaN       NaN  0.925820
4 NaN  NaN      NaN       NaN       NaN
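
The snippet above masks the lower triangle by writing through `c2.values`, which relies on `.values` returning a writable view of the frame's data. As far as I know that is not guaranteed in every pandas version (for example with Copy-on-Write enabled). A possible alternative, offered as a sketch and not as the asker's method, builds a boolean mask with `np.triu` and applies it with `DataFrame.where`. The variable name `mask` is my own, and the values are rounded as in the printed output.

```python
import numpy as np
import pandas as pd

# Same matrix as in the question, rounded to the printed precision
A = pd.DataFrame([
    [1.000000, 0.800000, 0.670820, 0.676123, 0.730297],
    [0.800000, 1.000000, 0.670820, 0.845154, 0.912871],
    [0.670820, 0.670820, 1.000000, 0.566947, 0.612372],
    [0.676123, 0.845154, 0.566947, 1.000000, 0.925820],
    [0.730297, 0.912871, 0.612372, 0.925820, 1.000000],
])

# True strictly above the diagonal (k=1 excludes the diagonal itself)
mask = np.triu(np.ones(A.shape, dtype=bool), k=1)

# Keep values where mask is True, put NaN everywhere else; A is left unchanged
c2 = A.where(mask)
print(c2)
```

This returns a new frame instead of modifying one in place, so the intermediate `A.copy()` is not needed.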

Now I want to get the row and column index pairs that match some criterion. For example: get the row and column indices where the value is greater than 0.8. For this, the output should be [1,3],[1,4],[3,4]. How can I do this?

Recommended answer

You can use NumPy's argwhere, which returns the integer (row, column) positions of every element that satisfies the condition:

In [11]: np.argwhere(c2 > 0.8)
Out[11]: 
array([[1, 3],
       [1, 4],
       [3, 4]])
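
For reference, here is a self-contained sketch of the same call. It converts the frame to a plain NumPy array with `to_numpy()` before comparing, on the assumption that passing a DataFrame straight to `np.argwhere` may behave differently across NumPy and pandas versions. The names `mask` and `pairs` are mine, and the values are rounded as in the printed output.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Upper triangle from the question, rounded to the printed precision
c2 = pd.DataFrame([
    [nan, 0.8, 0.670820, 0.676123, 0.730297],
    [nan, nan, 0.670820, 0.845154, 0.912871],
    [nan, nan, nan, 0.566947, 0.612372],
    [nan, nan, nan, nan, 0.925820],
    [nan, nan, nan, nan, nan],
])

# Comparisons with NaN evaluate to False, so masked cells never match
mask = c2.to_numpy() > 0.8

# One [row_position, column_position] pair per True cell, in row-major order
pairs = np.argwhere(mask)
print(pairs.tolist())  # [[1, 3], [1, 4], [3, 4]]
```

Note that 0.8 itself is not reported, because the comparison is strict.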

To get the index/column labels (rather than their integer positions), you could use a list comprehension:

[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)]
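
The difference between positions and labels only shows up when the frame is not labelled 0..n-1. As an illustration, the sketch below relabels the rows and columns with hypothetical letters `a` to `e` (not part of the original question) and gets the label pairs with pandas alone, by stacking the frame into a Series indexed by (row label, column label) and filtering it. This is an alternative to the list comprehension, not a replacement the answer itself proposes.

```python
import numpy as np
import pandas as pd

nan = np.nan
labels = list("abcde")  # hypothetical labels, for illustration only
c2 = pd.DataFrame(
    [
        [nan, 0.8, 0.670820, 0.676123, 0.730297],
        [nan, nan, 0.670820, 0.845154, 0.912871],
        [nan, nan, nan, 0.566947, 0.612372],
        [nan, nan, nan, nan, 0.925820],
        [nan, nan, nan, nan, nan],
    ],
    index=labels,
    columns=labels,
)

# Long format: one entry per cell, indexed by (row label, column label)
stacked = c2.stack()

# Boolean filtering drops non-matching cells; any NaN entries fail the comparison too
label_pairs = stacked[stacked > 0.8].index.tolist()
print(label_pairs)  # [('b', 'd'), ('b', 'e'), ('d', 'e')]
```

Keeping `stacked[stacked > 0.8]` itself, instead of only its index, also gives the matching values alongside each pair.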
