How to use numpy.argsort() as indices in more than 2 dimensions?

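Problem description

I know questions similar to this one have been asked many times already, but all the answers given to them seem to work only for arrays with 2 dimensions.

My understanding of np.argsort() is that np.sort(array) == array[np.argsort(array)] should be True. I have found that this is indeed correct if np.ndim(array) == 2, but it gives different results if np.ndim(array) > 2.

Example: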

>>> array = np.array([[[ 0.81774634,  0.62078744],
                       [ 0.43912609,  0.29718462]],
                      [[ 0.1266578 ,  0.82282054],
                       [ 0.98180375,  0.79134389]]])
>>> np.sort(array)
array([[[ 0.62078744,  0.81774634],
        [ 0.29718462,  0.43912609]],

       [[ 0.1266578 ,  0.82282054],
        [ 0.79134389,  0.98180375]]])
>>> array.argsort()
array([[[1, 0],
        [1, 0]],

       [[0, 1],
        [1, 0]]])
>>> array[array.argsort()]
array([[[[[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]],

         [[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]]],


        [[[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]],

         [[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]]]],



       [[[[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]],

         [[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]]],


        [[[ 0.1266578 ,  0.82282054],
          [ 0.98180375,  0.79134389]],

         [[ 0.81774634,  0.62078744],
          [ 0.43912609,  0.29718462]]]]])

So, can anybody explain to me how exactly np.argsort() can be used as indices to obtain the sorted array? The only way I can come up with is:

args = np.argsort(array)
array_sort = np.zeros_like(array)
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        array_sort[i, j] = array[i, j, args[i, j]]

which is extremely tedious and cannot be generalized to an arbitrary number of dimensions.

Solution

Here is a general method:

import numpy as np

array = np.array([[[ 0.81774634,  0.62078744],
                   [ 0.43912609,  0.29718462]],
                  [[ 0.1266578 ,  0.82282054],
                   [ 0.98180375,  0.79134389]]])

a = 1 # or 0 or 2

order = array.argsort(axis=a)

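# list(...) keeps idx mutable even if np.ogrid returns a tuple
idx = list(np.ogrid[tuple(map(slice, array.shape))])
# if you don't need full ND generality: in 3D this can be written
# much more readably as
# m, n, k = array.shape
# idx = list(np.ogrid[:m, :n, :k])

idx[a] = order

# index with a tuple: NumPy expects multi-axis indices as tuples
print(np.all(array[tuple(idx)] == np.sort(array, axis=a)))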

Output:

True

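Explanation: We must specify, for each element of the output array, the complete index of the corresponding element of the input array. Thus each index into the input array must either have the same shape as the output array or be broadcastable to that shape.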
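To make the "complete index" idea concrete, here is a minimal sketch. The names full_idx and picked are illustrative and not part of the original answer. It builds one dense index array per axis with np.indices, each with the shape of the output, swaps in the argsort result for the sort axis, and indexes with the resulting tuple.

```python
import numpy as np

array = np.array([[[0.81774634, 0.62078744],
                   [0.43912609, 0.29718462]],
                  [[0.1266578, 0.82282054],
                   [0.98180375, 0.79134389]]])

axis = 1  # illustrative choice; 0 or 2 work the same way
order = array.argsort(axis=axis)

# np.indices gives one dense index array per axis, each of shape array.shape
full_idx = list(np.indices(array.shape))
# along the sort axis, pick elements in sorted order instead of in place
full_idx[axis] = order

# picked[i, j, k] is array[full_idx[0][i, j, k],
#                          full_idx[1][i, j, k],
#                          full_idx[2][i, j, k]]
picked = array[tuple(full_idx)]
print(np.array_equal(picked, np.sort(array, axis=axis)))
```

Dense index arrays are wasteful for large inputs, since every axis carries a full-size integer array; index arrays that merely broadcast to the output shape give the same result.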

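The indices for the axes along which we do not sort/argsort stay in place. We therefore need to pass a broadcastable range(array.shape[i]) for each of those axes. The easiest way is to use ogrid to create such a range for all dimensions (if we used these directly, the array would come back unchanged) and then replace the index corresponding to the sort axis with the output of argsort.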
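The following sketch walks through those two steps on hypothetical random data (the seed, the shape, and the names unchanged and sorted_copy are assumptions made for illustration). It shows the shapes of the open ranges that ogrid produces, checks that indexing with them unmodified returns the array as is, and then replaces one entry with the argsort output.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
array = rng.random((2, 3, 4))    # hypothetical example data

# list(...) so that one entry can be replaced later,
# whether ogrid returns a list or a tuple
idx = list(np.ogrid[tuple(map(slice, array.shape))])

# one open range per axis: shapes (2, 1, 1), (1, 3, 1) and (1, 1, 4)
print([ix.shape for ix in idx])

# the unmodified ranges broadcast to array.shape and select every element in place
unchanged = array[tuple(idx)]
print(np.array_equal(unchanged, array))

# swap in the argsort output for the sort axis only
axis = 2
idx[axis] = array.argsort(axis=axis)
sorted_copy = array[tuple(idx)]
print(np.array_equal(sorted_copy, np.sort(array, axis=axis)))
```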

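UPDATE March 2019:

NumPy is becoming stricter about requiring multi-axis indices to be passed as tuples. At the time of this update, indexing with the list directly, as in array[idx], triggers a deprecation warning. To be future-proof, use array[tuple(idx)] instead. (Thanks @Nathan)

Alternatively, use NumPy's convenience function take_along_axis (new in version 1.15.0):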

np.take_along_axis(array, order, a)
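
As a quick check that this works for any number of dimensions and any axis, here is a small sketch on hypothetical 4-D random data (the seed, the shape, and the results list are illustrative assumptions). It compares take_along_axis against np.sort along each axis in turn.

```python
import numpy as np

rng = np.random.default_rng(1)     # arbitrary seed, for reproducibility only
array = rng.random((2, 3, 4, 5))   # hypothetical 4-D example data

results = []
for axis in range(array.ndim):
    order = array.argsort(axis=axis)
    # gather along `axis` using the argsort output as indices
    sorted_copy = np.take_along_axis(array, order, axis=axis)
    results.append(np.array_equal(sorted_copy, np.sort(array, axis=axis)))

print(results)  # one comparison result per axis
```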
