NumPy check if 2D array is subset of 2D array

Question

I would like to check whether the array b is a subset of the array a. By subset I mean that every element of b is found in a.

Here is my code:

import numpy as np

# a is the larger array; b holds the candidate subset values
a = np.array([[1, 7, 9], [8, 3, 12], [101, -74, 0.5]])
b = np.array([[1, 9], [8, 12], [101, 0.5]])
print(a)
print(b)

This is the output:

Array a:

[[   1.     7.     9. ]
 [   8.     3.    12. ]
 [ 101.   -74.     0.5]]

Array b:

[[   1.     9. ]
 [   8.    12. ]
 [ 101.     0.5]]

Is there a way to check if b is a subset of a?

Additional information:

As per the comments below, I should clarify that I need to know whether array b is a subset of array a: if even one element of b is missing from a, I am looking for a way to detect that. I do not need to know where the missing element sits, only that something is missing. Additional information about the missing element would be a bonus, but it is not a hard requirement. Apologies for not clearing this up earlier.

My reasoning in phrasing the question in terms of subsets is that if one array is a subset of another, then to me that implies all the values of the subset array are present in the larger array.

Recommended answer

This should work:

set(np.unique(b)).issubset(set(np.unique(a)))
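
The question also mentions that information about any missing elements would be a bonus. As a rough sketch (not part of the original answer), `np.isin` can give the same whole-array answer and also show which values of b are absent from a. The helper name `missing_values` is made up for illustration, and the comparison is exact equality, so floats that differ only by rounding error would count as missing.

```python
import numpy as np

def missing_values(b, a):
    """Return the unique values of b that do not occur anywhere in a.

    Hypothetical helper for illustration; uses exact equality.
    """
    # found has the same shape as b: True where that value occurs in a.
    found = np.isin(b, a)
    # Boolean indexing flattens the result; unique removes repeats.
    return np.unique(b[~found])

a = np.array([[1, 7, 9], [8, 3, 12], [101, -74, 0.5]])
b = np.array([[1, 9], [8, 12], [101, 0.5]])

missing = missing_values(b, a)
# b is a subset of a exactly when nothing is missing.
is_subset = missing.size == 0

print(is_subset)
print(missing)
```

An empty result means b is a subset of a; otherwise the returned array lists the offending values.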


EDIT: The code above returns a single True or False rather than a column vector of booleans. From @Eelco Hoogendoorn's comment on your question, I understand that you are actually interested in checking whether each row of b is a subset of the corresponding row of a, right? Assuming that this is the correct problem description, the following one-liner should work:

np.array([[set(bi).issubset(set(ai))] for ai, bi in zip(map(tuple, a), map(tuple, b))])

The code above is simple, readable, and does not require third-party dependencies. It is admittedly a quick and dirty solution: as @Bi Rico correctly pointed out, this approach can be quite inefficient, since it builds Python sets row by row. If you need to handle large arrays, you should stick to a vectorized algorithm.
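
One possible vectorized form of the row-wise check uses broadcasting instead of Python sets. This is a sketch that assumes a and b have the same number of rows and that the intermediate comparison array fits in memory. The function name `rows_are_subsets` is hypothetical and does not come from the original answer.

```python
import numpy as np

def rows_are_subsets(b, a):
    """Column vector of booleans: is row i of b a subset of row i of a?

    Illustrative helper; assumes a and b have the same number of rows
    and compares values with exact equality.
    """
    # Compare every element of each row of b with every element of the
    # matching row of a. Result shape: (rows, b_cols, a_cols).
    matches = b[:, :, None] == a[:, None, :]
    # An element of b is found if it equals any element in its row of a.
    found = matches.any(axis=2)
    # A row passes only if all of its elements were found.
    return found.all(axis=1, keepdims=True)

a = np.array([[1, 7, 9], [8, 3, 12], [101, -74, 0.5]])
b = np.array([[1, 9], [8, 12], [101, 0.5]])

result = rows_are_subsets(b, a)
print(result)
```

The trade-off is memory: the intermediate boolean array has rows × b_cols × a_cols entries, so for very wide rows a sort-based approach may be a better fit.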
