Counting consecutive 1's in NumPy array
Question
[1, 1, 1, 0, 0, 0, 1, 1, 0, 0]
I have a NumPy array consisting of 0's and 1's like the one above. How can I add up all consecutive 1's as shown below? Any time I encounter a 0, I reset the count.
[1, 2, 3, 0, 0, 0, 1, 2, 0, 0]
I can do this using a for loop, but is there a vectorized solution using NumPy?
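For reference, the loop the question alludes to might look like the sketch below. The asker does not show their code, so the function name `island_cumsum_loop` and its exact shape are assumptions made for illustration.

```python
import numpy as np

def island_cumsum_loop(a):
    # Hypothetical baseline: a running count that resets at every 0
    out = np.zeros(len(a), dtype=int)
    count = 0
    for i, v in enumerate(a):
        count = count + 1 if v == 1 else 0
        out[i] = count
    return out

a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])
print(island_cumsum_loop(a).tolist())  # [1, 2, 3, 0, 0, 0, 1, 2, 0, 0]
```

A loop like this is easy to read and serves as a correctness check for any vectorized version.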
Recommended answer
Here's a vectorized approach -
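import numpy as np

def island_cumsum_vectorized(a):
    # Pad with a zero on each side so every run of 1s has a start and an end edge
    a_ext = np.concatenate(([0], a, [0]))
    # Positions where the value changes: even entries start a run, odd entries end it
    idx = np.flatnonzero(a_ext[1:] != a_ext[:-1])
    # At the first 0 after each run, store minus the run length to reset the cumsum
    a_ext[1:][idx[1::2]] = idx[::2] - idx[1::2]
    return a_ext.cumsum()[1:-1]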
Sample run -
In [91]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])
In [92]: island_cumsum_vectorized(a)
Out[92]: array([1, 2, 3, 0, 0, 0, 1, 2, 0, 0])
In [93]: a = np.array([0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1])
In [94]: island_cumsum_vectorized(a)
Out[94]: array([0, 1, 2, 3, 4, 0, 0, 0, 1, 2, 0, 0, 1])
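To see why this works, it may help to print the intermediate arrays for the first sample input. The sketch below simply re-runs the steps of `island_cumsum_vectorized` outside the function; the variable names `starts` and `stops` are introduced here for illustration only.

```python
import numpy as np

a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])

# Pad with a zero on each side so runs touching the borders still have edges
a_ext = np.concatenate(([0], a, [0]))

# Positions where the value changes: even entries start a run of 1s,
# odd entries mark the first 0 after a run
idx = np.flatnonzero(a_ext[1:] != a_ext[:-1])
starts, stops = idx[::2], idx[1::2]
print(idx.tolist())     # [0, 3, 6, 8]

# Write minus the run length at the first 0 after each run
a_ext[1:][stops] = starts - stops
print(a_ext.tolist())   # [0, 1, 1, 1, -3, 0, 0, 1, 1, -2, 0, 0]

# The cumulative sum counts up inside a run and drops back to 0 after it
result = a_ext.cumsum()[1:-1]
print(result.tolist())  # [1, 2, 3, 0, 0, 0, 1, 2, 0, 0]
```

The negative entries are what replace the explicit reset in a loop: each one cancels exactly the count accumulated over the preceding run.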
Runtime test
For the timings, I would use OP's sample input array and repeat/tile it; hopefully this makes for a less opportunistic benchmark -
Small case:
In [16]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])
In [17]: a = np.tile(a,10) # Repeat OP's data 10 times
# @Paul Panzer's solution
In [18]: %timeit np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])
10000 loops, best of 3: 73.4 µs per loop
In [19]: %timeit island_cumsum_vectorized(a)
100000 loops, best of 3: 8.65 µs per loop
Bigger case:
In [20]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])
In [21]: a = np.tile(a,1000) # Repeat OP's data 1000 times
# @Paul Panzer's solution
In [22]: %timeit np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])
100 loops, best of 3: 6.52 ms per loop
In [23]: %timeit island_cumsum_vectorized(a)
10000 loops, best of 3: 49.7 µs per loop
No, I want a huge case:
In [24]: a = np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0])
In [25]: a = np.tile(a,100000) # Repeat OP's data 100000 times
# @Paul Panzer's solution
In [26]: %timeit np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in np.split(a, 1 + np.where(np.diff(a))[0])])
1 loops, best of 3: 725 ms per loop
In [27]: %timeit island_cumsum_vectorized(a)
100 loops, best of 3: 7.28 ms per loop
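The `%timeit` lines above are IPython magics. Outside IPython, a comparable measurement could be sketched with the standard-library `timeit` module, as below. The helper name `split_cumsum`, the tile count, and the number of runs are assumptions chosen for illustration, and the figures it prints will differ from those quoted above depending on the machine and the NumPy version.

```python
import timeit
import numpy as np

def island_cumsum_vectorized(a):
    a_ext = np.concatenate(([0], a, [0]))
    idx = np.flatnonzero(a_ext[1:] != a_ext[:-1])
    a_ext[1:][idx[1::2]] = idx[::2] - idx[1::2]
    return a_ext.cumsum()[1:-1]

def split_cumsum(a):
    # The split-based one-liner from the timings, wrapped in a function
    pieces = np.split(a, 1 + np.where(np.diff(a))[0])
    return np.concatenate([np.cumsum(c) if c[0] == 1 else c for c in pieces])

a = np.tile(np.array([1, 1, 1, 0, 0, 0, 1, 1, 0, 0]), 1000)

# Both approaches should agree before their speed is compared
assert np.array_equal(island_cumsum_vectorized(a), split_cumsum(a))

for fn in (split_cumsum, island_cumsum_vectorized):
    # Best of 3 repeats, 20 calls each, reported per call
    best = min(timeit.repeat(lambda: fn(a), number=20, repeat=3)) / 20
    print(f"{fn.__name__}: {best * 1e6:.1f} µs per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that `%timeit` reports in the session above.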