Finding Index of All Patterns Within Pandas Dataframe

Question

I'm using a Pandas dataframe indexed by datetimes that looks something like this:

TimeSys_Index
2014-08-29 00:00:18    0
2014-08-29 00:00:19    0
2014-08-29 00:00:20    1
2014-08-29 00:00:21    1
2014-08-29 00:00:22    0
2014-08-29 00:00:23    0
2014-08-29 00:00:24    0
2014-08-29 00:00:25    0
2014-08-29 00:00:26    0
2014-08-29 00:00:27    1
2014-08-29 00:00:28    1
2014-08-29 00:00:29    1
2014-08-29 00:00:30    1
2014-08-29 00:00:31    0
2014-08-29 00:00:32    0
2014-08-29 00:00:33    0
...

I want to find the index (time) for every occurrence of the pattern [0, 0, 1, 1]. Using the above sequence I'd like it to return ['2014-08-29 00:00:18', '2014-08-29 00:00:25']. The kicker is this needs to be vectorized or at least very quick.

I was thinking of running a correlation of the full vector with the pattern vector and finding the indices where the resulting vector equals 4, but there's got to be a simpler way.
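For reference, the correlation idea could be sketched roughly as follows. The question does not say how the values would be encoded, so this sketch assumes the 0/1 values are first mapped to -1/+1; under that assumption the correlation score equals the pattern length (4) only where all four positions match. The variable names and the sample data construction are illustrative, not taken from the original post.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; the index is rebuilt here for illustration.
values = np.array([0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0])
index = pd.Timestamp("2014-08-29 00:00:18") + pd.to_timedelta(
    np.arange(len(values)), unit="s"
)
pattern = np.array([0, 0, 1, 1])

# Assumption: recode 0 -> -1 and 1 -> +1 so that every matching position
# contributes +1 to the score and every mismatch contributes -1.
signed_values = 2 * values - 1
signed_pattern = 2 * pattern - 1

# mode="valid" scores only the windows that lie fully inside the data.
scores = np.correlate(signed_values, signed_pattern, mode="valid")

# A full match is a window whose score equals the pattern length.
hits = np.flatnonzero(scores == len(pattern))
matches = index[hits]
print(matches)
```

Without the -1/+1 recoding, a plain 0/1 correlation against [0, 0, 1, 1] only counts the ones in the last two positions, so it cannot tell [0, 0, 1, 1] apart from [1, 1, 1, 1].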

Recommended answer

You can look at the shifted values:

>>> df.head()
                     val
TimeSys_Index           
2014-08-29 00:00:18    0
2014-08-29 00:00:19    0
2014-08-29 00:00:20    1
2014-08-29 00:00:21    1
2014-08-29 00:00:22    0
>>> i = (df['val'] == 0) & (df['val'].shift(-1) == 0)
>>> i &= (df['val'].shift(-2) == 1) & (df['val'].shift(-3) == 1)
>>> df.index[i]
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-08-29 00:00:18, 2014-08-29 00:00:25]
Length: 2, Freq: None, Timezone: None
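The same shift-based comparison can be wrapped in a small helper so that it works for a pattern of any length. This is a minimal sketch rather than part of the original answer: the function name `find_pattern_starts` is hypothetical, and the sample frame is rebuilt from the data shown in the question.

```python
import numpy as np
import pandas as pd

def find_pattern_starts(series, pattern):
    """Return the index labels at which `pattern` begins in `series`.

    Hypothetical helper that generalizes the shift-based comparison.
    """
    mask = pd.Series(True, index=series.index)
    for offset, value in enumerate(pattern):
        # shift(-offset) lines up the value `offset` rows ahead with the
        # current row; rows shifted past the end become NaN and never match.
        mask &= series.shift(-offset) == value
    return series.index[mask]

# Rebuild the sample data from the question.
data = [0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
idx = pd.Timestamp("2014-08-29 00:00:18") + pd.to_timedelta(
    np.arange(len(data)), unit="s"
)
df = pd.DataFrame({"val": data}, index=idx)
df.index.name = "TimeSys_Index"

starts = find_pattern_starts(df["val"], [0, 0, 1, 1])
print(starts)
```

The loop runs once per pattern element, not once per row, so each step is still a vectorized comparison over the whole column.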
