Numpy: For every element in one array, find the index in another array


Question

I have two 1D arrays, x and y, one smaller than the other. I'm trying to find the index in x of every element of y.

I've found two naive ways to do this: the first is slow, and the second is memory-intensive.

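The slow way:

indices = []
for iy in y:
    # index of the first occurrence of iy in x (raises IndexError if iy is not in x)
    indices.append(np.where(x == iy)[0][0])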

The memory hog:

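xe = np.outer([1,]*len(x), y)   # xe[i, j] == y[j]
ye = np.outer(x, [1,]*len(y))   # ye[i, j] == x[i]
# transpose so matches are ordered by position in y; the second output indexes into x
junk, indices = np.where(np.equal(xe, ye).T)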

Is there a faster or less memory-intensive approach? Ideally the search would take advantage of the fact that we are searching a list for not one thing but many, which makes it slightly more amenable to parallelization. Bonus points if you don't assume that every element of y is actually in x.

Recommended answer

As Joe Kington said, searchsorted() can search for elements very quickly. To deal with elements that are not in x, you can compare the search result against the original y and create a masked array:

import numpy as np
x = np.array([3,5,7,1,9,8,6,6])
y = np.array([2,1,5,10,100,6])

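# sort x, remembering where each sorted element came from
index = np.argsort(x)
sorted_x = x[index]
# position of each element of y within the sorted copy of x
sorted_index = np.searchsorted(sorted_x, y)

# map back to positions in the original x; clip positions past the end
yindex = np.take(index, sorted_index, mode="clip")
# mask entries whose candidate in x does not actually equal the element of y
mask = x[yindex] != y

result = np.ma.array(yindex, mask=mask)
print(result)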

The result is:

[-- 3 1 -- -- 6]
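
If a plain integer array is more convenient than a masked array, the same searchsorted() idea can be wrapped in a small function. The sketch below is only an illustration: the name find_indices and the missing=-1 sentinel are assumptions and not part of the original answer. It also requests a stable sort so that, when x contains duplicates, the index of the first occurrence is the one returned.

```python
import numpy as np

def find_indices(x, y, missing=-1):
    """Hypothetical helper: for each element of y, return its index in x, or `missing`."""
    x = np.asarray(x)
    y = np.asarray(y)
    if x.size == 0:
        # nothing to search in: every element of y is missing
        return np.full(y.shape, missing, dtype=np.intp)
    order = np.argsort(x, kind="stable")          # stable sort keeps ties in original order
    pos = np.searchsorted(x[order], y)            # leftmost insertion point in sorted x
    candidate = np.take(order, pos, mode="clip")  # clip positions that fall past the end
    # keep the candidate only where it really matches; otherwise use the sentinel
    return np.where(x[candidate] == y, candidate, missing)

x = np.array([3, 5, 7, 1, 9, 8, 6, 6])
y = np.array([2, 1, 5, 10, 100, 6])
print(find_indices(x, y))  # [-1  3  1 -1 -1  6]
```

The sort costs O(n log n) in the length of x and each lookup is a binary search, so no len(x) by len(y) intermediate array is built. Note that a sentinel such as -1 is itself a valid NumPy index, so it should be checked for before the result is used to index x.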
