Finding indices of values in a 2D numpy array

Question

I'm trying to get index values out of a numpy array, and I've tried using intersections to no avail. I'm simply trying to find matching values in two arrays. One is 2D and I'm selecting a single column from it; the other is 1D, just a list of values to search for. So effectively I have two 1D arrays.

We'll call this array a:

 array([[    1, 97553,     1],
       [    1, 97587,     1],
       [    1, 97612,     1],
       [    1, 97697,     1],
       [    1, 97826,     3],
       [    1, 97832,     1],
       [    1, 97839,     1],
       [    1, 97887,     1],
       [    1, 97944,     1],
       [    1, 97955,     2]])

And we're searching for, say, values = numpy.array([97612, 97633, 97697, 97999, 97943, 97944])

So I try:

numpy.where(a[:, 1] == values)

I'd expect to get back the indices of the matching values, but instead I get an empty result: [(array([], dtype=int64),)].

If I try just this:

numpy.where(a[:, 1] == 97697)

It gives me back (array([3]),), which is what I would expect.

What quirk of arrays am I missing here? Or is there an easier way to do this? Finding array indices and matching arrays doesn't work the way I expect: when I try to find the union or intersection of arrays, by index or by unique value, it just doesn't seem to work. Any help would be appreciated. Thanks.

Edit: As per Warren's request:

import numpy

a = numpy.array([[    1, 97553,     1],
       [    1, 97587,     1],
       [    1, 97612,     1],
       [    1, 97697,     1],
       [    1, 97826,     3],
       [    1, 97832,     1],
       [    1, 97839,     1],
       [    1, 97887,     1],
       [    1, 97944,     1],
       [    1, 97955,     2]])

values = numpy.array([97612, 97633, 97697, 97999, 97943, 97944])

I've found that numpy.in1d gives me a correct boolean mask for the operation: a 1D array with the same length as the column, which maps back to the original data. My only remaining issue is how to act on it, for instance deleting or modifying the original array at those indices. I could do it laboriously with a loop, but as far as I know there are better ways in numpy. From what I have been able to find, boolean masks are supposed to be quite powerful in numpy.

Recommended answer

np.where (https://docs.scipy.org/doc/numpy/reference/generated/numpy.where.html) with a single argument is equivalent to np.nonzero. It gives you the indices where a condition, the input array, is True.

In your example you are checking for element-wise equality between a[:, 1] and values:

a[:, 1] == values
False

So np.where is giving you the correct result: the two arrays have different lengths (10 and 6), so the comparison collapses to a single False and no index in the input is True.
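To see why the comparison cannot match element by element, it may help to look at the two shapes. The sketch below assumes the a and values arrays from the question. As far as I know, the mismatched comparison returns a single False with a warning on older NumPy releases and raises an error on newer ones, so the example only checks whether the shapes can be broadcast (the names col and broadcastable are illustrative).

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

col = a[:, 1]
print(col.shape, values.shape)   # (10,) (6,)

# Element-wise == needs shapes that are equal or broadcastable.
# np.broadcast raises ValueError when they are not.
try:
    np.broadcast(col, values)
    broadcastable = True
except ValueError:
    broadcastable = False

print(broadcastable)             # False
```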

You should use np.isin instead:

np.isin(a[:,1], values)
array([False, False,  True,  True, False, False, False, False,  True, False], dtype=bool)

Now you can use np.where to get the indices:

np.where(np.isin(a[:,1], values))
(array([2, 3, 8]),)

and use those to index the original array:

a[np.where(np.isin(a[:,1], values))]    
array([[    1, 97612,     1],
       [    1, 97697,     1],
       [    1, 97944,     1]])
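For reference, here are the steps above combined into one self-contained sketch. It assumes NumPy 1.13 or later for np.isin. np.flatnonzero is used as a shortcut for np.where(mask)[0], and the names rows, matched, and found are only illustrative.

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

# Boolean mask: True where the second column holds one of the searched values.
mask = np.isin(a[:, 1], values)

# Row indices of the matches as a plain 1D array.
rows = np.flatnonzero(mask)
print(rows)    # [2 3 8]

# The mask can index the array directly; np.where is not required for this.
matched = a[mask]

# The reverse question: which of the searched values actually occur in a?
found = values[np.isin(values, a[:, 1])]
print(found)   # [97612 97697 97944]
```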

Your initial solution with a simple equality check could indeed have worked with proper broadcasting:

np.where(a[:,1] == values[..., np.newaxis])[1]
array([2, 3, 8])
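A short sketch of what the broadcast comparison produces may make the [1] clearer. Here col stands in for a[:, 1] from the question, and eq, value_idx, and row_idx are illustrative names.

```python
import numpy as np

# col stands in for a[:, 1] from the question.
col = np.array([97553, 97587, 97612, 97697, 97826,
                97832, 97839, 97887, 97944, 97955])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])

# values[:, np.newaxis] has shape (6, 1); comparing it with shape (10,)
# broadcasts to a (6, 10) table: one row per searched value,
# one column per element of col.
eq = col == values[:, np.newaxis]
print(eq.shape)      # (6, 10)

# np.where returns one index array per axis of eq.
value_idx, row_idx = np.where(eq)
print(value_idx)     # [0 2 5]  positions in values that were found
print(row_idx)       # [2 3 8]  positions in col where they were found

# Collapsing the table over the values axis gives the same mask as np.isin.
same_mask = np.array_equal(eq.any(axis=0), np.isin(col, values))
print(same_mask)     # True
```

The intermediate table has len(values) * len(col) entries, so for large inputs np.isin is probably the more economical choice.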






Edit: Since you seem to have trouble using the above results to index and manipulate your array, here are a couple of simple examples.

Now you should have two ways of accessing the matching elements in the original array: either the boolean mask or the indices from np.where.

mask = np.isin(a[:,1], values)  # np.in1d if np.isin is not available
idx = np.where(mask)

Let's say you want to set all matching rows to zero:

a[mask] = 0   # or a[idx] = 0
array([[    1, 97553,     1],
       [    1, 97587,     1],
       [    0,     0,     0],
       [    0,     0,     0],
       [    1, 97826,     3],
       [    1, 97832,     1],
       [    1, 97839,     1],
       [    1, 97887,     1],
       [    0,     0,     0],
       [    1, 97955,     2]])

Or you want to multiply the third column of matching rows by 100:

a[mask, 2] *= 100
array([[    1, 97553,     1],
       [    1, 97587,     1],
       [    1, 97612,   100],
       [    1, 97697,   100],
       [    1, 97826,     3],
       [    1, 97832,     1],
       [    1, 97839,     1],
       [    1, 97887,     1],
       [    1, 97944,   100],
       [    1, 97955,     2]])
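Both assignments modify a in place, and each output shown starts from the original array. A minimal sketch, assuming you want to keep a untouched, is to work on copies (zeroed and scaled are illustrative names):

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])
mask = np.isin(a[:, 1], values)

# Zero out the matching rows on a copy.
zeroed = a.copy()
zeroed[mask] = 0

# Multiply the third column of the matching rows by 100 on another copy.
scaled = a.copy()
scaled[mask, 2] *= 100

print(a[2])        # [    1 97612     1]  original is unchanged
print(zeroed[2])   # [0 0 0]
print(scaled[2])   # [    1 97612   100]
```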

Or you want to delete matching rows (here using indices is more convenient than masks):

np.delete(a, idx, axis=0)
array([[    1, 97553,     1],
       [    1, 97587,     1],
       [    1, 97826,     3],
       [    1, 97832,     1],
       [    1, 97839,     1],
       [    1, 97887,     1],
       [    1, 97955,     2]])
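Note that np.delete returns a new array and leaves a itself unchanged. If you prefer to stay with the mask, inverting it should give the same rows. A small sketch, with kept_by_delete and kept_by_mask as illustrative names:

```python
import numpy as np

a = np.array([[1, 97553, 1],
              [1, 97587, 1],
              [1, 97612, 1],
              [1, 97697, 1],
              [1, 97826, 3],
              [1, 97832, 1],
              [1, 97839, 1],
              [1, 97887, 1],
              [1, 97944, 1],
              [1, 97955, 2]])
values = np.array([97612, 97633, 97697, 97999, 97943, 97944])
mask = np.isin(a[:, 1], values)

# Index-based deletion: returns a new array without the matching rows.
kept_by_delete = np.delete(a, np.flatnonzero(mask), axis=0)

# Mask-based alternative: keep every row where the mask is False.
kept_by_mask = a[~mask]

print(np.array_equal(kept_by_delete, kept_by_mask))   # True
print(a.shape, kept_by_mask.shape)                    # (10, 3) (7, 3)
```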
