Numpy find indices of groups with same value
Question
I have a numpy array of zeros and ones:
y = [1,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,1]
I want to calculate the indices of groups of ones (or zeros). For the example above, the result for groups of ones should look something like:
result=[(0,2), (8,9), (16,19)]
(How) Can I do that with numpy? I found nothing like a group-by function.
I experimented with np.ediff1d, but couldn't figure out a good solution. Note that the array may or may not begin/end with a group of ones:
import numpy as np
y = [1,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,1]
mask = np.ediff1d(y)
starts = np.where(mask > 0)
ends = np.where(mask < 0)
I also found a partial solution here: Find index where elements change value numpy
But that one only gives me the indices where the values change.
Recommended answer
We can do something like this that works for any generic array -
import numpy as np

def islandinfo(y, trigger_val, stopind_inclusive=True):
    # Accept plain lists too, so that y == trigger_val compares elementwise
    y = np.asarray(y)

    # Set up "sentinels" on either side to make sure we have
    # "ramps" to catch the start and stop for the edge islands
    # (left-most and right-most islands) respectively
    y_ext = np.r_[False, y == trigger_val, False]

    # Get indices of shifts, which represent the start and stop indices
    idx = np.flatnonzero(y_ext[:-1] != y_ext[1:])

    # Lengths of islands if needed
    lens = idx[1::2] - idx[:-1:2]

    # Using a stepsize of 2 would get us start and stop indices for each island
    return list(zip(idx[:-1:2], idx[1::2] - int(stopind_inclusive))), lens
Sample run -
In [320]: y
Out[320]: array([1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1])
In [321]: islandinfo(y, trigger_val=1)[0]
Out[321]: [(0, 2), (8, 9), (16, 19)]
In [322]: islandinfo(y, trigger_val=0)[0]
Out[322]: [(3, 7), (10, 15)]
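If the runs of every value are wanted in one pass, rather than one trigger value at a time, the same shift-comparison idea can be extended. The following is a minimal sketch, assuming a 1-D input; the name all_runs is hypothetical and not part of the answer above:

```python
import numpy as np

def all_runs(y):
    """Return (values, starts, stops) for every run of equal values in 1-D y.

    Hypothetical helper; stops are inclusive, matching the sample run above.
    """
    y = np.asarray(y)
    if y.size == 0:
        empty = np.array([], dtype=int)
        return y, empty, empty
    # A new run begins at index 0 and at every index whose value
    # differs from its predecessor
    starts = np.r_[0, np.flatnonzero(y[1:] != y[:-1]) + 1]
    # Each run ends right before the next one starts; the last ends at len-1
    stops = np.r_[starts[1:] - 1, y.size - 1]
    return y[starts], starts, stops

y = np.array([1,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,1])
values, starts, stops = all_runs(y)
# Keep only the runs of ones
ones = [(int(s), int(e)) for v, s, e in zip(values, starts, stops) if v == 1]
print(ones)  # [(0, 2), (8, 9), (16, 19)]
```

Filtering on values afterwards gives a rough group-by over runs, at the cost of computing boundaries for values that may not be needed.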
Alternatively, we can use diff to get the sliced comparisons and then simply reshape with 2 columns to replace the step-sized slicing to give ourselves a one-liner -
In [300]: np.flatnonzero(np.diff(np.r_[0,y,0])!=0).reshape(-1,2) - [0,1]
Out[300]:
array([[ 0,  2],
       [ 8,  9],
       [16, 19]])
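As written, the one-liner relies on y holding only 0 and 1 and finds groups of ones. A possible way to reuse it for other trigger values is to build a 0/1 mask first. This is only a sketch; the wrapper name island_bounds and its trigger_val default are assumptions made for illustration:

```python
import numpy as np

def island_bounds(y, trigger_val=1):
    """Hypothetical wrapper: inclusive (start, stop) rows for runs of trigger_val."""
    # Convert to a 0/1 mask so the diff trick works for any trigger value
    mask = (np.asarray(y) == trigger_val).astype(int)
    # Zero padding on both sides guarantees edge runs produce a +1/-1 pair
    edges = np.flatnonzero(np.diff(np.r_[0, mask, 0]) != 0)
    # Rows are (start, exclusive stop); subtract 1 from stop to make it inclusive
    return edges.reshape(-1, 2) - [0, 1]

y = [1,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,1,1,1,1]
print(island_bounds(y, 1).tolist())  # [[0, 2], [8, 9], [16, 19]]
print(island_bounds(y, 0).tolist())  # [[3, 7], [10, 15]]
```

Because the padding always adds a rising and a falling edge around any run that touches the array boundary, the number of edges is even and the reshape to two columns is safe.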