Finding a valley in noisy data


Problem description

Date        Time_GMT Time_IST    Current
11/15/2016  5:12:27 10:42:27    26.61
11/15/2016  5:12:28 10:42:28    42.27
11/15/2016  5:12:29 10:42:29    25.48
11/15/2016  5:12:30 10:42:30    24.24
11/15/2016  5:12:31 10:42:31    25.91
11/15/2016  5:12:32 10:42:32    27.75
11/15/2016  5:12:33 10:42:33    24.46
11/15/2016  5:12:34 10:42:34    24.32
11/15/2016  5:12:35 10:42:35    24.81
11/15/2016  5:12:36 10:42:36    27.36
11/15/2016  5:12:37 10:42:37    28.2
11/15/2016  5:12:38 10:42:38    28.29
11/15/2016  5:12:39 10:42:39    26.52
11/15/2016  5:12:40 10:42:40    32.58
11/15/2016  5:12:41 10:42:41    24.24
11/15/2016  5:12:42 10:42:42    24.36
11/15/2016  5:12:43 10:42:43    26.48
11/15/2016  5:12:44 10:42:44    28.76
11/15/2016  5:12:45 10:42:45    24.51
11/15/2016  5:12:46 10:42:46    23.93
11/15/2016  5:12:47 10:42:47    25.23
11/15/2016  5:12:48 10:42:48    27.9
11/15/2016  5:12:49 10:42:49    27.84
11/15/2016  5:12:50 10:42:50    27.31
11/15/2016  5:12:51 10:42:51    29.17
11/15/2016  5:12:52 10:42:52    24
11/15/2016  5:12:53 10:42:53    32.51
11/15/2016  5:12:54 10:42:54    26.63
11/15/2016  5:12:55 10:42:55    22.34
11/15/2016  5:12:56 10:42:56    29.14
11/15/2016  5:12:57 10:42:57    46.62
11/15/2016  5:12:58 10:42:58    48.85
11/15/2016  5:12:59 10:42:59    30.59
11/15/2016  5:13:00 10:43:00    30.68
11/15/2016  5:13:01 10:43:01    30.82
11/15/2016  5:13:02 10:43:02    31.64
11/15/2016  5:13:03 10:43:03    43.91

The above is sample data; the real data goes on for days. I have to find the depression in the current, as shown in the image. If the current stays below 30 amps for a long time, I have to detect that valley-like depression. I have been working on this for a while and cannot think of any logic that finds it precisely. Any suggestion is appreciated. A machine learning approach is also acceptable.

Solution

You could just use a moving window average approach:

  1. Select an appropriate window width (in your case, consecutive entries are one second apart, so the width is measured in seconds)

  2. Iterate over your currents column and calculate the average of the currents within the chosen window

  3. Check when the average drops below a threshold or rises above it, depending on its prior state
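The three steps above can be sketched as a small streaming helper. This is only an illustration under assumptions: the names `rolling_states` and `state_changes` are hypothetical, and the running sum is one possible way to avoid re-summing every window when the data spans days.

```python
from collections import deque


def rolling_states(values, window, threshold):
    """Yield (index, mean, state) once `window` samples have been seen.

    The mean covers the `window` most recent samples; the state is 'good'
    when the mean is above `threshold`, otherwise 'bad'.
    """
    recent = deque()
    total = 0.0
    for i, value in enumerate(values):
        recent.append(value)
        total += value
        # Drop the oldest sample once the window is over-full
        if len(recent) > window:
            total -= recent.popleft()
        if len(recent) == window:
            mean = total / window
            yield i, mean, 'good' if mean > threshold else 'bad'


def state_changes(values, window, threshold):
    """Return [(index, state)] for every position where the state flips."""
    changes = []
    previous = None
    for i, _mean, state in rolling_states(values, window, threshold):
        if state != previous:
            changes.append((i, state))
            previous = state
    return changes
```

Calling `state_changes(currents, window=5, threshold=27.0)` on a list of readings would return the positions at which the averaged signal crosses the threshold; the window width and threshold still have to be tuned to the data.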

With your example data, this may look like the following. In this plot, your original currents data is drawn as a blue dashed line, the moving average is the thick green line, and state changes are marked by red vertical lines.

The code I used to generate that image is:

import matplotlib
import matplotlib.pyplot as plt

c = [26.61, 42.27, 25.48, 24.24, 25.91, 27.75, 24.46, 24.32, 24.81, 27.36, 28.2, 28.29, 26.52, 32.58, 24.24, 24.36, 26.48, 28.76, 24.51, 23.93, 25.23, 27.9, 27.84, 27.31, 29.17, 24, 32.51, 26.63, 22.34, 29.14, 46.62, 48.85, 30.59, 30.68, 30.82, 31.64, 43.91]

if __name__ == '__main__':
    # Choose window width and threshold
    window = 5
    thres = 27.0

    # Iterate and collect state changes with regard to previous state
    changes = []
    rolling = [None] * window
    old_state = None
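    # Walk the series up to and including the last sample
    for i in range(window, len(c)):
        # Average exactly `window` samples ending at position i
        slc = c[i - window + 1:i + 1]
        mean = sum(slc) / float(len(slc))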
        state = 'good' if mean > thres else 'bad'

        rolling.append(mean)
        if not old_state or old_state != state:
            print('Changed to {:>4s} at position {:>3d} ({:5.3f})'.format(state, i, mean))
            changes.append((i, state))
            old_state = state

    # Plot results and state changes
    plt.figure(frameon=False, figsize=(10, 8))
    currents, = plt.plot(c, ls='--', label='Current')
    rollwndw, = plt.plot(rolling, lw=2, label='Rolling Mean')
    plt.axhline(thres, xmin=.0, xmax=1.0, c='grey', ls='-')
    plt.text(40, thres, 'Threshold: {:.1f}'.format(thres), horizontalalignment='right')
    # Use loop names that do not shadow the data list `c`
    for pos, label in changes:
        plt.axvline(pos, ymin=.0, ymax=.7, c='red', ls='-')
        plt.text(pos, 41.5, label, color='red', rotation=90, verticalalignment='bottom')
    plt.legend(handles=[currents, rollwndw], fontsize=11)
    plt.grid(True)
    # The 'local' directory must already exist, otherwise saving fails
    plt.savefig('local/plot.png', dpi=72, bbox_inches='tight')
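
The question asks for dips that last "a long time", so one possible extension is to keep only runs where the rolling mean stays at or below the threshold for a minimum number of consecutive positions. The sketch below is an assumption layered on top of the answer, not part of it; `find_valleys` and `min_length` are hypothetical names.

```python
def find_valleys(values, window, threshold, min_length):
    """Return (start, end) index pairs of sustained dips.

    A dip is a run of at least `min_length` consecutive positions whose
    rolling mean (over the `window` samples ending at that position) is at
    or below `threshold`. Both indices are inclusive.
    """
    valleys = []
    start = None
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1:i + 1]
        mean = sum(chunk) / len(chunk)
        if mean <= threshold:
            # Remember where the current run began
            if start is None:
                start = i
        else:
            # The run ended at i - 1; keep it only if it was long enough
            if start is not None and i - start >= min_length:
                valleys.append((start, i - 1))
            start = None
    # Handle a run that is still open at the end of the data
    if start is not None and len(values) - start >= min_length:
        valleys.append((start, len(values) - 1))
    return valleys
```

With one-second samples, `min_length` is the minimum dip duration in seconds. A single low reading surrounded by high ones does not pull the mean under the threshold for long enough, so it is not reported as a valley.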
