Find indexes of matching rows in two 2-D arrays

Question

Suppose that I have two 2-D arrays as follows:

array([[3, 3, 1, 0],
       [2, 3, 1, 3],
       [0, 2, 3, 1],
       [1, 0, 2, 3],
       [3, 1, 0, 2]], dtype=int8)

array([[0, 3, 3, 1],
       [0, 2, 3, 1],
       [1, 0, 2, 3],
       [3, 1, 0, 2],
       [3, 3, 1, 0]], dtype=int8)

Some rows in each array have a corresponding row in the other array that matches by value (but not necessarily by index), and some don't.

I would like to find an efficient way to return the pairs of indexes in the two arrays that correspond to matching rows. If they were returned as tuples, I would expect:

(0,4)
(2,1)
(3,2)
(4,3)

Recommended answer

This is an all-numpy solution, not that it is necessarily better than an iterative Python one. It still has to look at all combinations of rows. Here x and y are the first and second arrays from the question.

In [53]: np.array(np.all((x[:,None,:]==y[None,:,:]),axis=-1).nonzero()).T.tolist()
Out[53]: [[0, 4], [2, 1], [3, 2], [4, 3]]

The intermediate array has shape (5,5,4). np.all reduces it over the last axis to:

array([[False, False, False, False,  True],
       [False, False, False, False, False],
       [False,  True, False, False, False],
       [False, False,  True, False, False],
       [False, False, False,  True, False]], dtype=bool)

The rest is just extracting the indices where this is True.
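To make the steps easier to follow, here is the same one-liner split into stages. It is only a sketch: the names eq, match and pairs are mine, and np.argwhere is swapped in for the nonzero()/transpose step, which gives the same indices.

```python
import numpy as np

# The two arrays from the question.
x = np.array([[3, 3, 1, 0],
              [2, 3, 1, 3],
              [0, 2, 3, 1],
              [1, 0, 2, 3],
              [3, 1, 0, 2]], dtype=np.int8)
y = np.array([[0, 3, 3, 1],
              [0, 2, 3, 1],
              [1, 0, 2, 3],
              [3, 1, 0, 2],
              [3, 3, 1, 0]], dtype=np.int8)

# Broadcast so every row of x is compared with every row of y: shape (5, 5, 4).
eq = x[:, None, :] == y[None, :, :]

# Two rows match only if all 4 elements agree: shape (5, 5).
match = eq.all(axis=-1)

# Indices of the True entries, as (index in x, index in y) tuples.
pairs = [tuple(p) for p in np.argwhere(match).tolist()]
print(pairs)  # [(0, 4), (2, 1), (3, 2), (4, 3)]
```

Note that the intermediate array grows as len(x) * len(y) * row length, so for large inputs memory may become the limiting factor before time does.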

In crude tests, this takes 47.8 µs; the other answer, with the L1 dictionary, takes 38.3 µs; and a third, with a double loop, takes 496 µs.
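The dictionary-based answer is not reproduced on this page, so the following is only a guess at what such an approach might look like, not that answer's actual code. The function name matching_row_pairs and the variable rows_of_y are hypothetical. The idea is to hash the rows of one array as tuples and then look up each row of the other, which avoids comparing every pair of rows.

```python
import numpy as np

def matching_row_pairs(x, y):
    """Return (i, j) pairs such that row i of x equals row j of y."""
    # Map each distinct row of y (as a hashable tuple) to the indices where it occurs.
    rows_of_y = {}
    for j, row in enumerate(y.tolist()):
        rows_of_y.setdefault(tuple(row), []).append(j)
    # Look up each row of x; rows without a match contribute nothing.
    return [(i, j)
            for i, row in enumerate(x.tolist())
            for j in rows_of_y.get(tuple(row), [])]

# The two arrays from the question.
x = np.array([[3, 3, 1, 0],
              [2, 3, 1, 3],
              [0, 2, 3, 1],
              [1, 0, 2, 3],
              [3, 1, 0, 2]], dtype=np.int8)
y = np.array([[0, 3, 3, 1],
              [0, 2, 3, 1],
              [1, 0, 2, 3],
              [3, 1, 0, 2],
              [3, 3, 1, 0]], dtype=np.int8)

print(matching_row_pairs(x, y))  # [(0, 4), (2, 1), (3, 2), (4, 3)]
```

Timings for this sketch are not claimed to match the figures quoted above; they would need to be measured separately, for example with timeit.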
