Finding a 'spike' or drop in a dataset programmatically
Question
Suppose I have a dataset that looks like this:
[0.523,0.445,0.558,0.492,0.440,0.502,0.742,0.802,0.821,0.811,0.804,0.860]
As you can see, there is a 'spike' in the values after 0.502. Is there a way to find this programmatically in Python? I'm already using NumPy and SciPy, and I'm sure those libraries contain something like this. I just don't know what the procedure is called.
An added bonus would be the ability to adjust the 'sensitivity' of detecting a spike or drop, since the dataset can be quite noisy. A spike would mean a sustained increase in the moving average of the values, and a drop would mean a sustained decrease.
Each value lies in the range [-1, 1]. The array would contain 50-100 values.
Recommended answer
I would recommend using NumPy's diff function, which returns the difference between each pair of consecutive elements:
import numpy
a = [0.523,0.445,0.558,0.492,0.440,0.502,0.742,0.802,0.821,0.811,0.804,0.860]
numpy.diff(a)  # element i is a[i+1] - a[i]
This gives you:
array([-0.078, 0.113, -0.066, -0.052, 0.062, 0.24 , 0.06 , 0.019,
-0.01 , -0.007, 0.056])
If a number is positive, the data jumped up at that step; if it is negative, the data jumped down.
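As a rough sketch of reading the direction off the sign, numpy.sign maps each difference to -1, 0 or 1, and numpy.argmax on the absolute differences locates the largest single step. The variable names d, direction and largest are illustrative and not part of the original answer.

```python
import numpy as np

a = [0.523, 0.445, 0.558, 0.492, 0.440, 0.502,
     0.742, 0.802, 0.821, 0.811, 0.804, 0.860]

d = np.diff(a)                        # d[i] = a[i + 1] - a[i]
direction = np.sign(d)                # +1 for a step up, -1 for a step down, 0 for no change
largest = int(np.argmax(np.abs(d)))   # index of the biggest single step

print(direction)
print(largest, a[largest], a[largest + 1])  # the step from 0.502 to 0.742 is at index 5
```

Note that index i in the result of diff describes the step from a[i] to a[i + 1], so the diff array is one element shorter than the input.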
If you just want to find where the spikes are, whether up or down, try this:
abs(numpy.diff(a)) > 0.2
Raising the 0.2 threshold makes the detection less sensitive, and lowering it makes the detection more sensitive. For the data above, this gives:
array([False, False, False, False, False, True, False, False, False,
False, False], dtype=bool)
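The question also mentions noisy data and moving averages. One possible way to combine that with the threshold idea is sketched below; it is an assumption layered on top of the answer, not something the answer itself proposes. The helper find_jumps and its window parameter are hypothetical names. It optionally smooths the data with a simple moving average (numpy.convolve with a uniform kernel) before taking differences, then returns the indices and directions of the steps that exceed the threshold.

```python
import numpy as np

def find_jumps(values, threshold=0.2, window=1):
    """Return (indices, directions) of steps whose size exceeds threshold.

    window > 1 applies a simple moving average first (an assumption for
    noisy data, not part of the original answer). Indices then refer to
    positions in the smoothed series, which is window - 1 elements shorter.
    """
    values = np.asarray(values, dtype=float)
    if window > 1:
        kernel = np.ones(window) / window
        values = np.convolve(values, kernel, mode="valid")
    d = np.diff(values)                          # step sizes
    idx = np.flatnonzero(np.abs(d) > threshold)  # steps above the threshold
    return idx, np.sign(d[idx])                  # +1 = up, -1 = down

a = [0.523, 0.445, 0.558, 0.492, 0.440, 0.502,
     0.742, 0.802, 0.821, 0.811, 0.804, 0.860]

idx, directions = find_jumps(a)                  # same test as abs(numpy.diff(a)) > 0.2
print(idx, directions)

idx_smooth, dir_smooth = find_jumps(a, threshold=0.05, window=3)
print(idx_smooth, dir_smooth)
```

Smoothing shrinks each individual step, so the threshold usually has to be lowered along with a larger window; the values 0.05 and 3 here are only illustrative.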