Check if a numpy array is a subset of another array
Question
Similar questions have already been asked on SO, but they have more specific constraints and their answers don't apply to my question.
Generally speaking, what is the most pythonic way to determine if an arbitrary numpy array is a subset of another array? More specifically, I have a roughly 20000x3 array and I need to know the indices of the 1x3 elements that are entirely contained within a set. More generally, is there a more pythonic way of writing the following:
import numpy as np

master = [12, 155, 179, 234, 670, 981, 1054, 1209, 1526, 1667, 1853]  # some indices of interest
triangles = np.random.randint(2000, size=(20000, 3))  # some data
for i, x in enumerate(triangles):
    # print the row index when all three entries appear in master
    if x[0] in master and x[1] in master and x[2] in master:
        print(i)
For my use case, I can safely assume that len(master) << 20000. (Consequently, it is also safe to assume that master is sorted because this is cheap).
Recommended answer
You can do this easily by iterating over the rows of the array in a list comprehension. A toy example:
import numpy as np
x = np.arange(30).reshape(10,3)
searchKey = [4,5,8]
x[[0,3,7],:] = searchKey
x
which gives
array([[ 4,  5,  8],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 4,  5,  8],
       [12, 13, 14],
       [15, 16, 17],
       [18, 19, 20],
       [ 4,  5,  8],
       [24, 25, 26],
       [27, 28, 29]])
Now iterate over the rows:
ismember = [row==searchKey for row in x.tolist()]
The result is
[True, False, False, True, False, False, False, True, False, False]
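The same row-wise comparison can probably also be written without a Python-level loop. The following is a minimal sketch that is not part of the original answer: it relies on NumPy broadcasting, and the name ismember_vec is chosen here purely for illustration.

```python
import numpy as np

# Rebuild the toy array from the example above.
x = np.arange(30).reshape(10, 3)
searchKey = [4, 5, 8]
x[[0, 3, 7], :] = searchKey

# Broadcasting compares every row of x with searchKey element-wise;
# all(axis=1) is True only for rows where all three entries match.
ismember_vec = (x == np.asarray(searchKey)).all(axis=1)  # illustrative name
print(ismember_vec.tolist())
```

This should produce the same boolean pattern as the list comprehension, with the comparison done inside NumPy instead of row by row in Python.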
You can adapt the list comprehension to test for a subset, as in your question:
searchKey = [2,4,10,5,8,9] # Add more elements for testing
setSearchKey = set(searchKey)
ismember = [setSearchKey.issuperset(row) for row in x.tolist()]
If you need the indices, use
np.where(ismember)[0]
which gives
array([0, 3, 7])
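For the 20000x3 case in the question, a vectorized membership test may be worth trying as an alternative to the set-based comprehension. The sketch below is not from the original answer. It assumes np.isin is available (NumPy 1.13 or later), uses a seeded generator only for reproducibility, and plants two matching rows (indices 5 and 42, chosen arbitrarily) so that the result is not empty.

```python
import numpy as np

master = [12, 155, 179, 234, 670, 981, 1054, 1209, 1526, 1667, 1853]

# Assumption: seeded random data stands in for the real triangles array.
rng = np.random.default_rng(0)
triangles = rng.integers(2000, size=(20000, 3))

# Assumption: plant two rows drawn entirely from master so there are known hits.
triangles[5] = [12, 155, 179]
triangles[42] = [1853, 1853, 670]

# np.isin checks every entry against master and returns a (20000, 3) boolean array;
# all(axis=1) keeps the rows whose three entries are all in master.
mask = np.isin(triangles, master).all(axis=1)
indices = np.flatnonzero(mask)

# Cross-check against the set-based comprehension from the answer.
set_master = set(master)
expected = [i for i, row in enumerate(triangles.tolist()) if set_master.issuperset(row)]
assert indices.tolist() == expected

print(indices)
```

Both approaches should return the same indices. Which one is faster depends on the sizes of the array and of master, so it is worth timing them on the real data.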