How to find indices of a reordered numpy array?

Question

Say I have a sorted numpy array:

arr = np.array([[0.0, 0.0],
                [0.5, 0.0],
                [1.0, 0.0],
                [0.0, 0.5],
                [0.5, 0.5],
                [1.0, 0.5],
                [0.0, 1.0],
                [0.5, 1.0],
                [1.0, 1.0]])

Suppose I apply a nontrivial operation to it, producing a new array that has the same rows as the old one but in a different order:

arr2 = np.array([[0.5, 0.0],
                 [0.0, 0.0],
                 [0.0, 0.5],
                 [1.0, 0.0],
                 [0.5, 0.5],
                 [1.0, 0.5],
                 [0.0, 1.0],
                 [1.0, 1.0],
                 [0.5, 1.0]])

The question is: how do I get the index in arr of each element of arr2? In other words, I want a function that takes both arrays and returns an array of the same length as arr2, holding for each element of arr2 its index in arr. For example, the first element of the returned array would be the index in arr of the first element of arr2.

where_things_are(arr2, arr)
# returns: array([1, 0, 3, 2, 4, 5, 6, 8, 7])

Does a function like this already exist in numpy?

I tried:

np.array([np.where((arr == x).all(axis=1)) for x in arr2])

which returns what I want, but my question still stands: is there a more efficient way of doing this with numpy methods?

It should also work if arr2 is not the same length as the original array (for example, if I removed some elements from it). So the task is not to find and invert a permutation, but to find where each element is located.

Recommended answer

The key is inverting permutations. The code below works even if the original array is not sorted. If it is sorted, find_map_sorted can be used instead, which is faster.
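To make "inverting a permutation" concrete, here is a minimal sketch on the arr2 from the question. It assumes, as in the question, that the original arr is sorted by its second column and then its first, so the sorted position of each row of arr2 is also its index in arr. The names order and position are illustrative only.

```python
import numpy as np

# arr2 from the question (a reordering of the sorted array arr)
arr2 = np.array([[0.5, 0.0], [0.0, 0.0], [0.0, 0.5],
                 [1.0, 0.0], [0.5, 0.5], [1.0, 0.5],
                 [0.0, 1.0], [1.0, 1.0], [0.5, 1.0]])

# order[k] is the row of arr2 that ends up at sorted position k
# (lexsort uses the last key, here the second column, as primary key)
order = np.lexsort(arr2.T)

# Inverting the permutation: position[i] is the sorted position of arr2[i]
position = np.empty_like(order)
position[order] = np.arange(len(order))

print(position)  # [1 0 3 2 4 5 6 8 7]
```

The scatter assignment builds the inverse in linear time; np.argsort(order) gives the same result at the cost of another sort.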

UPDATE: Adapting to the OP's changing requirements, I've added a branch to find_map_sorted that handles lost elements.

import numpy as np

def invperm(p):
    q = np.empty_like(p)
    q[p] = np.arange(len(p))
    return q

def find_map(arr1, arr2):
    o1 = np.argsort(arr1)
    o2 = np.argsort(arr2)
    return o2[invperm(o1)]

def find_map_2d(arr1, arr2):
    o1 = np.lexsort(arr1.T)
    o2 = np.lexsort(arr2.T)
    return o2[invperm(o1)]

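def find_map_sorted(arr1, arrs=None):
    # Without arrs: arr1 holds every row, so its inverse sort permutation
    # already indexes the sorted array.
    if arrs is None:
        o1 = np.lexsort(arr1.T)
        return invperm(o1)
    # With arrs (sorted, unique rows): view each row as a single record so
    # np.unique can compare whole rows; columns are reversed to match
    # lexsort's key order.
    rdtype = np.rec.fromrecords(arrs[:1, ::-1]).dtype
    recstack = np.r_[arrs[:,::-1], arr1[:,::-1]].view(rdtype).view(np.recarray)
    # ravel() makes the input 1-D, so the inverse is 1-D as well
    uniq, inverse = np.unique(recstack.ravel(), return_inverse=True)
    return inverse[len(arrs):]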

x1 = np.random.permutation(100000)
x2 = np.random.permutation(100000)
print(np.all(x2[find_map(x1, x2)] == x1))

rows = np.random.random((100000, 8))
r1 = rows[x1, :]
r2 = rows[x2, :]
print(np.all(r2[find_map_2d(r1, r2)] == r1))

rs = r1[np.lexsort(r1.T), :]
print(np.all(rs[find_map_sorted(r2), :] == r2))

# lose ten elements
print(np.all(rs[find_map_sorted(r2[:-10], rs), :] == r2[:-10]))
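
For the case where some rows are missing, a different technique from the record-array branch above can be sketched with np.unique(..., axis=0). This is not the answer's method: locate_rows is a hypothetical helper name, and the sketch assumes the rows of ref are unique and that rows match by exact floating-point equality. Rows of query that do not occur in ref are reported as -1.

```python
import numpy as np

def locate_rows(query, ref):
    """Index in ref of each row of query, or -1 if the row is absent.

    Hypothetical helper; assumes the rows of ref are unique.
    """
    stacked = np.vstack([ref, query])
    # inverse[i] is the group id of stacked[i] among the distinct rows
    uniq, inverse = np.unique(stacked, axis=0, return_inverse=True)
    inverse = inverse.ravel()  # guard: some NumPy releases return a 2-D inverse
    # lookup[group id] = position of that row in ref (-1 if not in ref)
    lookup = np.full(len(uniq), -1, dtype=np.intp)
    lookup[inverse[:len(ref)]] = np.arange(len(ref))
    return lookup[inverse[len(ref):]]

# The arrays from the question
arr = np.array([[x, y] for y in (0.0, 0.5, 1.0) for x in (0.0, 0.5, 1.0)])
arr2 = arr[[1, 0, 3, 2, 4, 5, 6, 8, 7]]

print(locate_rows(arr2, arr))        # [1 0 3 2 4 5 6 8 7]
print(locate_rows(arr2[:-2], arr))   # [1 0 3 2 4 5 6]
```

Unlike find_map_sorted, this sketch does not need ref to be sorted, at the cost of sorting the stacked array once inside np.unique.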
