Remove consecutive duplicates in a NumPy array
Question
I would like to remove duplicates which follow each other, but not duplicates along the whole array. Also, I want to keep the ordering unchanged.
So if the input is [0 0 1 3 2 2 3 3], the output should be [0 1 3 2 3].
I found a way using itertools.groupby(), but I am looking for a faster NumPy solution.
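For reference, the itertools.groupby() approach mentioned above might look like the sketch below. The question does not show its code, so this is an assumed reconstruction; the names a and deduped are illustrative.

```python
from itertools import groupby

import numpy as np

a = np.array([0, 0, 1, 3, 2, 2, 3, 3])

# groupby() yields one (key, group) pair per run of equal adjacent values,
# so keeping only the keys drops consecutive duplicates and preserves order.
deduped = np.array([key for key, _group in groupby(a)])

print(deduped)
```

This iterates over the array element by element in Python, which is why a vectorized NumPy version is likely to be faster on large arrays.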
Recommended answer
In [99]: a[np.insert(np.diff(a).astype(bool), 0, True)]
Out[99]: array([0, 1, 3, 2, 3])
The general idea is to use diff to find the difference between each pair of consecutive elements in the array. We then keep only the elements whose difference from the previous element is non-zero. Because the result of diff is one element shorter than the input, we insert True at the beginning of the diff array before indexing, so that the first element is always kept.
Illustration:
In [100]: a
Out[100]: array([0, 0, 1, 3, 2, 2, 3, 3])
In [101]: diff = np.diff(a).astype(bool)
In [102]: diff
Out[102]: array([False, True, True, True, False, True, False], dtype=bool)
In [103]: idx = np.insert(diff, 0, True)
In [104]: idx
Out[104]: array([ True, False, True, True, True, False, True, False], dtype=bool)
In [105]: a[idx]
Out[105]: array([0, 1, 3, 2, 3])
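The same idea can be wrapped in a small function. The sketch below is a variation rather than the answer's exact code: it assumes a 1-D input and compares neighbours with != instead of calling diff, which also works for non-numeric dtypes such as strings. The function name drop_consecutive_duplicates is hypothetical.

```python
import numpy as np


def drop_consecutive_duplicates(a):
    """Return a 1-D array with runs of equal adjacent values collapsed to one."""
    a = np.asarray(a)
    if a.size == 0:
        # Nothing to compare; return the empty array unchanged.
        return a
    keep = np.empty(a.shape[0], dtype=bool)
    keep[0] = True                  # the first element is always kept
    keep[1:] = a[1:] != a[:-1]      # keep elements that differ from their predecessor
    return a[keep]


result = drop_consecutive_duplicates([0, 0, 1, 3, 2, 2, 3, 3])
print(result)
```

Building the mask directly avoids the extra copy made by np.insert, though for most inputs the difference is likely to be small.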