How do I identify sequences of values in a boolean array?


Question

I have a long boolean array:

bool_array = [ True, True, True, True, True, False, False, False, False, False, True, True, True, False, False, True, True, True, True, False, False, False, False, False, False, False ]

I need to figure out where the values flip, i.e., the indices at which runs of True and False begin (plus the total length as the final boundary). In this particular case, I would want to get

index = [0, 5, 10, 13, 15, 19, 26]

Is there an easy way to do this without manually looping to compare every ith element with the (i+1)th?

Recommended answer

As a more efficient approach for large datasets, in Python 3 you can use the accumulate and groupby functions from the itertools module.

>>> from itertools import accumulate, groupby
>>> [0] + list(accumulate(sum(1 for _ in g) for _,g in groupby(bool_array)))
[0, 5, 10, 13, 15, 19, 26]


The logic behind the code:

This code groups consecutive duplicate items using groupby(), then loops over the iterator it returns. That iterator yields pairs: a key (discarded here by binding it to an underscore, the conventional throwaway name) and an iterator over the items in that group.

>>> [list(g) for _, g in groupby(bool_array)]
[[True, True, True, True, True], [False, False, False, False, False], [True, True, True], [False, False], [True, True, True, True], [False, False, False, False, False, False, False]]

So all we need to do is compute the length of each group and add it to the sum of the preceding lengths. Each running total is the index at which the next run begins, which is exactly where the value changes, and that is what accumulate() is for.
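To make the two steps concrete, here is a minimal sketch that separates the run lengths from their running sum. The helper name run_starts is hypothetical (it is not part of itertools), and bool_array is rebuilt here only so the snippet runs on its own.

```python
from itertools import accumulate, groupby


def run_starts(values):
    """Hypothetical helper: start index of each run, followed by the total length."""
    # Step 1: length of each run of consecutive equal values.
    run_lengths = [sum(1 for _ in group) for _, group in groupby(values)]
    # Step 2: the running sum of the lengths is where each following run begins.
    return [0] + list(accumulate(run_lengths))


# Same data as the array in the question, written compactly.
bool_array = [True] * 5 + [False] * 5 + [True] * 3 + [False] * 2 + [True] * 4 + [False] * 7

print(run_starts(bool_array))  # [0, 5, 10, 13, 15, 19, 26]
```

Note that the last value is the length of the input rather than the start of a run; drop it if only the start positions are needed.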

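In NumPy you can use the following approach. Neighbouring elements are compared with != because recent NumPy versions raise a TypeError when one boolean array is subtracted from another:

In [17]: import numpy as np
In [18]: arr = np.array(bool_array)
In [19]: np.where(arr[1:] != arr[:-1])[0] + 1
Out[19]: array([ 5, 10, 13, 15, 19])
# With leading and trailing indices
In [22]: np.concatenate(([0], np.where(arr[1:] != arr[:-1])[0] + 1, [arr.size]))
Out[22]: array([ 0,  5, 10, 13, 15, 19, 26])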
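The same idea can be wrapped in a small function. This is a sketch under a few assumptions: the name flip_indices is hypothetical, the input is converted with np.asarray so a plain list also works, and neighbours are compared with != so that no boolean subtraction is involved.

```python
import numpy as np


def flip_indices(values):
    """Hypothetical helper: start index of every run, followed by the array length."""
    arr = np.asarray(values, dtype=bool)
    # True wherever an element differs from its predecessor; +1 shifts the
    # position from the "before" element to the first element of the new run.
    flips = np.flatnonzero(arr[1:] != arr[:-1]) + 1
    return np.concatenate(([0], flips, [arr.size]))


# Same data as the array in the question, written compactly.
bool_array = [True] * 5 + [False] * 5 + [True] * 3 + [False] * 2 + [True] * 4 + [False] * 7

print(flip_indices(bool_array).tolist())  # [0, 5, 10, 13, 15, 19, 26]
```

np.flatnonzero(x) is equivalent to np.where(x)[0] for a one-dimensional array, so this is the same computation as the session above.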
