Get column and row index pairs of Pandas DataFrame matching some criteria
Question
Suppose I have a Pandas DataFrame like the following. The values are based on a distance matrix.
import numpy as np
import pandas as pd

A = pd.DataFrame([(1.0,0.8,0.6708203932499369,0.6761234037828132,0.7302967433402214),
                  (0.8,1.0,0.6708203932499369,0.8451542547285166,0.9128709291752769),
                  (0.6708203932499369,0.6708203932499369,1.0,0.5669467095138409,0.6123724356957946),
                  (0.6761234037828132,0.8451542547285166,0.5669467095138409,1.0,0.9258200997725514),
                  (0.7302967433402214,0.9128709291752769,0.6123724356957946,0.9258200997725514,1.0)
                  ])
Output:
Out[65]:
          0         1         2         3         4
0  1.000000  0.800000  0.670820  0.676123  0.730297
1  0.800000  1.000000  0.670820  0.845154  0.912871
2  0.670820  0.670820  1.000000  0.566947  0.612372
3  0.676123  0.845154  0.566947  1.000000  0.925820
4  0.730297  0.912871  0.612372  0.925820  1.000000
I want only the upper triangle.
c2 = A.copy()
c2.values[np.tril_indices_from(c2)] = np.nan
Output:
Out[67]:
    0    1        2         3         4
0 NaN  0.8  0.67082  0.676123  0.730297
1 NaN  NaN  0.67082  0.845154  0.912871
2 NaN  NaN      NaN  0.566947  0.612372
3 NaN  NaN      NaN       NaN  0.925820
4 NaN  NaN      NaN       NaN       NaN
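The in-place assignment through `c2.values` relies on `.values` handing back a writable view of the frame's data, which is not guaranteed across pandas versions. A minimal sketch of an alternative, assuming the same matrix (rounded to three decimals here for brevity), builds a boolean mask with `np.triu` and applies it with `DataFrame.where`:

```python
import numpy as np
import pandas as pd

# Same distance matrix as above, rounded to three decimals for brevity.
A = pd.DataFrame([
    (1.0, 0.8, 0.671, 0.676, 0.730),
    (0.8, 1.0, 0.671, 0.845, 0.913),
    (0.671, 0.671, 1.0, 0.567, 0.612),
    (0.676, 0.845, 0.567, 1.0, 0.926),
    (0.730, 0.913, 0.612, 0.926, 1.0),
])

# True strictly above the diagonal (k=1 excludes the diagonal itself).
upper = np.triu(np.ones(A.shape, dtype=bool), k=1)

# Keep values where the mask is True; everything else becomes NaN.
c2 = A.where(upper)
print(c2)
```

This leaves `A` untouched and does not depend on whether the underlying array can be modified in place.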
Now I want to get the row and column index pairs that match some criterion.
For example: get the row and column indexes where the value is greater than 0.8. For this, the output should be [1,3],[1,4],[3,4]. Any help with this?
Recommended answer
You can use numpy's argwhere:
In [11]: np.argwhere(c2 > 0.8)
Out[11]:
array([[1, 3],
       [1, 4],
       [3, 4]])
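How NumPy treats a DataFrame passed straight to `np.argwhere` can depend on the installed NumPy and pandas versions. A minimal sketch that sidesteps this, assuming the same matrix (rounded for brevity), converts the boolean mask to a plain NumPy array first. Comparisons against NaN evaluate to False, so the masked lower triangle never matches:

```python
import numpy as np
import pandas as pd

# Same distance matrix as in the question, rounded to three decimals.
A = pd.DataFrame([
    (1.0, 0.8, 0.671, 0.676, 0.730),
    (0.8, 1.0, 0.671, 0.845, 0.913),
    (0.671, 0.671, 1.0, 0.567, 0.612),
    (0.676, 0.845, 0.567, 1.0, 0.926),
    (0.730, 0.913, 0.612, 0.926, 1.0),
])

# Upper triangle only; the diagonal and lower triangle become NaN.
c2 = A.where(np.triu(np.ones(A.shape, dtype=bool), k=1))

# Convert the boolean DataFrame to an ndarray before calling argwhere.
mask = (c2 > 0.8).to_numpy()
pairs = np.argwhere(mask)  # one [row, column] position per match
print(pairs.tolist())  # [[1, 3], [1, 4], [3, 4]]
```

Note that the entry at position [0, 1] is exactly 0.8, so the strict `>` comparison leaves it out, which matches the output expected in the question.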
To get the index/column labels (rather than their integer positions), you could use a list comprehension:
[(c2.index[i], c2.columns[j]) for i, j in np.argwhere(c2 > 0.8)]
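With the default integer index, labels and positions coincide, so the difference is easier to see on a frame with named rows and columns. A small sketch under that assumption (the labels `a` to `e` are hypothetical and the values are rounded for brevity):

```python
import numpy as np
import pandas as pd

labels = list("abcde")  # hypothetical labels, not part of the original question
A = pd.DataFrame(
    [
        (1.0, 0.8, 0.671, 0.676, 0.730),
        (0.8, 1.0, 0.671, 0.845, 0.913),
        (0.671, 0.671, 1.0, 0.567, 0.612),
        (0.676, 0.845, 0.567, 1.0, 0.926),
        (0.730, 0.913, 0.612, 0.926, 1.0),
    ],
    index=labels,
    columns=labels,
)

# Combine the strict upper-triangle mask with the threshold condition.
mask = np.triu(np.ones(A.shape, dtype=bool), k=1) & (A.to_numpy() > 0.8)

# Integer positions of the matches, in row-major order.
rows, cols = np.nonzero(mask)

# Map positions back to the row and column labels.
pairs = [(A.index[i], A.columns[j]) for i, j in zip(rows, cols)]
print(pairs)  # [('b', 'd'), ('b', 'e'), ('d', 'e')]
```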