How to detect a sudden change in a time series plot in Pandas


Question

I am trying to "detect" a sudden drop in velocity in a series, and I'm not sure how to capture it. The details and code are below:

This is a snippet of the Series that I have, along with the code to produce it:

velocity_df.velocity.car1

Index   velocity
200     17.9941
201     17.9941
202     18.4031
203     18.4031

This is a plot of the entire series:

I'm trying to detect the sudden drop that runs from index 220 to roughly 230-240 and save it as a Series that looks like this:

Index   velocity
220      14.927
221      14.927
222      14.927
223      14.927
224      14.518
225      14.518
226     16.1538
227     12.2687
228     9.20155
229     6.33885
230     4.49854

I'm just trying to capture an approximate range where the speed suddenly decreases, so that I can use other features over that range.

If I can add any additional information, please let me know. Thank you!

Recommended answer

This would be a simple approach if you want to compare consecutive values pairwise:

Given the series from your question, called s, you can construct the absolute discrete derivative of your data by subtracting each value from the next one (that is, the series minus a copy of itself shifted by 1):

d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()
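
To try this end to end, here is a minimal sketch that rebuilds s from the 220-230 excerpt quoted in the question. The real series is longer, so this sample is only an assumption for illustration. It also shows that `s.diff(-1)` should give the same numbers, with one extra trailing NaN that is dropped here.

```python
import numpy as np
import pandas as pd

# Sample data: only the 220-230 excerpt quoted in the question.
s = pd.Series(
    [14.927, 14.927, 14.927, 14.927, 14.518, 14.518,
     16.1538, 12.2687, 9.20155, 6.33885, 4.49854],
    index=pd.RangeIndex(220, 231, name="Index"),
    name="velocity",
)

# Same construction as in the answer: d[i] = |s[i+1] - s[i]|, labelled by i.
d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()

# Built-in alternative: diff(-1) computes s[i] - s[i+1]; its last entry is NaN.
d_alt = s.diff(-1).abs().dropna()

print(d)
```

Note that each entry of d describes the step from index i to index i+1, so d has one element fewer than s.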

If you now take the maximum m of that series of absolute differences, you can multiply it by a factor a between 0 and 1 and use the result as a threshold:

a = .7
m = d.max()
print(d > m * a)

The last line prints a boolean Series that is True at the indices where the change exceeds the threshold.
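
Since the question asks for the drop saved as a Series, one way to turn the boolean mask into a range is sketched below. It assumes the same 220-230 excerpt as sample data and simply takes everything from the first flagged step to the last. On a longer series, an isolated spike elsewhere would widen that range, so treat this as a starting point rather than a finished detector. The names `mask`, `pos` and `segment` are made up for this example.

```python
import numpy as np
import pandas as pd

# Sample data: only the 220-230 excerpt quoted in the question.
s = pd.Series(
    [14.927, 14.927, 14.927, 14.927, 14.518, 14.518,
     16.1538, 12.2687, 9.20155, 6.33885, 4.49854],
    index=pd.RangeIndex(220, 231, name="Index"),
    name="velocity",
)

d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()
a = 0.7
m = d.max()
mask = d > m * a

# Positions of the flagged steps; step i spans samples i and i+1,
# so the slice ends one sample past the last flagged step.
pos = np.flatnonzero(mask.to_numpy())
if pos.size:
    segment = s.iloc[pos[0]:pos[-1] + 2]
else:
    segment = s.iloc[0:0]  # nothing flagged (e.g. a constant series)

print(segment)
```

Because d uses absolute differences, a sudden rise is flagged as well. If only drops matter, comparing the signed difference against a negative threshold would be the obvious variation.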

Building on this, you could use a sliding-window technique, such as kernel density estimation or a Parzen window, to get smoother results:

r = d.rolling(3, min_periods=1, win_type='parzen').sum()
n = r.max()

As before, we can print the matching elements:

print(r > n * a)

This gives the following output:

Index
220    False
221    False
222    False
223    False
224    False
225    False
226    False
227     True
228     True
229     True
dtype: bool
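
One thing to keep in mind with the rolling version is that a trailing window reports each change a little late. The sketch below illustrates this on the 220-230 excerpt, but it swaps the Parzen window for a plain unweighted rolling sum (so SciPy, which `win_type` needs, is not required). That swap and the helper name `flagged` are my own assumptions for illustration. Passing `center=True` re-aligns the flags with the steps where the change happens.

```python
import pandas as pd

# Sample data: only the 220-230 excerpt quoted in the question.
s = pd.Series(
    [14.927, 14.927, 14.927, 14.927, 14.518, 14.518,
     16.1538, 12.2687, 9.20155, 6.33885, 4.49854],
    index=pd.RangeIndex(220, 231, name="Index"),
    name="velocity",
)

d = pd.Series(s.values[1:] - s.values[:-1], index=s.index[:-1]).abs()
a = 0.7


def flagged(r, a):
    """Index labels where r exceeds a times its own maximum."""
    return r[r > r.max() * a].index.tolist()


# Unweighted windows of length 3 (not the Parzen weights used above).
trailing = d.rolling(3, min_periods=1).sum()               # covers i-2 .. i
centered = d.rolling(3, min_periods=1, center=True).sum()  # covers i-1 .. i+1

print(flagged(trailing, a))  # [227, 228, 229]
print(flagged(centered, a))  # [226, 227, 228]
```

On this sample the trailing window flags 227-229, while the centered one flags 226-228, which are the steps flagged by the unsmoothed threshold.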

