How to count continuous numbers in numpy

Views: 68 Posted: 2021/11/18 4:28:49 Tags: python arrays numpy

Question

I have a one-dimensional NumPy array of 1s and 0s, for example:

a = np.array([0,1,1,1,0,0,0,0,0,0,0,1,0,1,1,0,0,0,1,1,0,0])

I want to count the runs of consecutive 0s and 1s in the array and output something like this:

[1,3,7,1,1,2,3,2,2]

What I am doing at the moment is

np.diff(np.where(np.abs(np.diff(a)) == 1)[0])

which outputs

array([3, 7, 1, 1, 2, 3, 2])

As you can see, it is missing the first count, 1 (the trailing count, 2, is dropped as well).

I've tried np.split and then taking the size of each segment, but that does not seem optimal.

Is there a more elegant, "pythonic" solution?

Recommended answer

Here's one vectorized approach -

np.diff(np.r_[0,np.flatnonzero(np.diff(a))+1,a.size])

Sample run -

In [208]: a = np.array([0,1,1,1,0,0,0,0,0,0,0,1,0,1,1,0,0,0,1,1,0,0])

In [209]: np.diff(np.r_[0,np.flatnonzero(np.diff(a))+1,a.size])
Out[209]: array([1, 3, 7, 1, 1, 2, 3, 2, 2])
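To see why the one-liner works, it can be unpacked into separate steps. The following is only a sketch: the function name run_lengths and the intermediate variable names are illustrative and do not come from the original answer.

```python
import numpy as np

def run_lengths(a):
    """Return the lengths of consecutive runs of equal values in a 1-D array."""
    a = np.asarray(a)
    # np.diff is nonzero wherever a value differs from its predecessor;
    # adding 1 turns those positions into the index where each new run starts.
    change_points = np.flatnonzero(np.diff(a)) + 1
    # Add the start of the first run (0) and the end of the last run (a.size).
    boundaries = np.r_[0, change_points, a.size]
    # The distance between consecutive boundaries is the length of each run.
    return np.diff(boundaries)

a = np.array([0,1,1,1,0,0,0,0,0,0,0,1,0,1,1,0,0,0,1,1,0,0])
lengths = run_lengths(a)
print(lengths)  # [1 3 7 1 1 2 3 2 2]
```

The attempt in the question only measured distances between interior change points, which is why the first and last runs were lost; padding the boundaries with 0 and a.size restores them.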

A faster one with boolean concatenation -

np.diff(np.flatnonzero(np.concatenate(([True], a[1:]!= a[:-1], [True] ))))
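The boolean variant marks run boundaries with an element-wise comparison instead of a subtraction. As a hedged illustration, the same mask can also recover the value of each run alongside its length. The helper name runs and its return layout are assumptions made for this sketch, and it assumes a non-empty 1-D input.

```python
import numpy as np

def runs(a):
    """Return (values, lengths) for each run of equal values in a 1-D array.

    Assumes `a` is non-empty; this helper is illustrative only.
    """
    a = np.asarray(a)
    # True at the start of every run, plus a sentinel True one past the end.
    is_boundary = np.concatenate(([True], a[1:] != a[:-1], [True]))
    boundaries = np.flatnonzero(is_boundary)
    # Gaps between boundaries are the run lengths.
    lengths = np.diff(boundaries)
    # Every boundary except the sentinel points at the first element of a run.
    values = a[boundaries[:-1]]
    return values, lengths

a = np.array([0,1,1,1,0,0,0,0,0,0,0,1,0,1,1,0,0,0,1,1,0,0])
values, lengths = runs(a)
print(values)   # [0 1 0 1 0 1 0 1 0]
print(lengths)  # [1 3 7 1 1 2 3 2 2]
```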

Runtime test

For the setup, let's create a bigger dataset with islands of 0s and 1s. For a fair benchmark against the given sample, the island lengths vary between 1 and 6 (np.random.randint excludes its upper bound of 7) -

In [257]: n = 100000 # creates 100000 islands, alternating between 0s and 1s

In [258]: a = np.repeat(np.arange(n)%2, np.random.randint(1,7,(n)))

# Approach #1 proposed in this post
In [259]: %timeit np.diff(np.r_[0,np.flatnonzero(np.diff(a))+1,a.size])
100 loops, best of 3: 2.13 ms per loop

# Approach #2 proposed in this post
In [260]: %timeit np.diff(np.flatnonzero(np.concatenate(([True], a[1:]!= a[:-1], [True] ))))
1000 loops, best of 3: 1.21 ms per loop

# @Vineet Jain's solution (requires: from itertools import groupby)
In [261]: %timeit [ sum(1 for i in g) for k,g in groupby(a)]
10 loops, best of 3: 61.3 ms per loop
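The timings above come from the answerer's own IPython session and will differ on other machines. A rough way to rerun the comparison as a plain script is sketched below. The function names, the fixed seed, and the use of the standard timeit module in place of %timeit are choices made for this sketch, not part of the original answer.

```python
import timeit
from itertools import groupby

import numpy as np

n = 100000
rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
# n alternating islands of 0s and 1s, each of length 1..6.
a = np.repeat(np.arange(n) % 2, rng.integers(1, 7, n))

def approach1(a):
    return np.diff(np.r_[0, np.flatnonzero(np.diff(a)) + 1, a.size])

def approach2(a):
    return np.diff(np.flatnonzero(np.concatenate(([True], a[1:] != a[:-1], [True]))))

def groupby_approach(a):
    return [sum(1 for _ in g) for _, g in groupby(a)]

# Check that all three approaches agree before timing them.
expected = approach1(a)
assert np.array_equal(expected, approach2(a))
assert np.array_equal(expected, groupby_approach(a))

for f in (approach1, approach2, groupby_approach):
    # Best of 3 repeats, 5 calls per repeat, reported per call.
    best = min(timeit.repeat(lambda: f(a), number=5, repeat=3)) / 5
    print(f"{f.__name__}: {best * 1e3:.2f} ms per call")
```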
