Finding very large jumps in data


Problem description

I need to find only the very large jumps, so that I can identify clusters and, later, the noise as well. The sample data is as follows:

0.000000
0.000500
0.001500
0.003000
0.005500
0.008700
0.012400
0.000000
0.000500
0.001500
0.003000
0.005500
0.008700
0.012400
0.000000
0.000500
0.001500
0.003000
0.005500
0.008700
0.012400
0.000000
0.000500
0.001500
0.003000
0.005500
0.008700
0.012400
0.000000
0.000500
0.001500
0.003000
0.005500
0.008700
0.012400
0.000000
0.000500
0.001500
0.003000
0.005500
0.008700
0.012400
0.012400

I need to do this in Python, but any generic algorithm would be welcome as well.

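I have tried:

  1. Finding the distance between each pair of consecutive points.
  2. Finding the ratio of consecutive distances.
  3. Finding how close consecutive ratios are to each other.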
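The three steps above can be sketched with NumPy as follows. This is a hypothetical reconstruction, since the original code was not posted: the function name `consecutive_ratio_closeness` is made up for illustration, and `numpy.isclose` (the element-wise counterpart of `numpy.allclose`) is used with NumPy's default tolerances.

```python
import numpy as np

def consecutive_ratio_closeness(values, rtol=1e-05, atol=1e-08):
    """Hypothetical reconstruction of the three steps listed above."""
    values = np.asarray(values, dtype=float)
    # Step 1: distance between each pair of consecutive points.
    distances = np.diff(values)
    # Step 2: ratio of consecutive distances (a zero denominator yields inf/nan).
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = distances[1:] / distances[:-1]
    # Step 3: element-wise closeness of consecutive ratios, using fixed tolerances.
    close = np.isclose(ratios[1:], ratios[:-1], rtol=rtol, atol=atol)
    return distances, ratios, close

# First nine values of the sample data, covering one jump back to zero.
sample = [0.0, 0.0005, 0.0015, 0.003, 0.0055, 0.0087, 0.0124, 0.0, 0.0005]
distances, ratios, close = consecutive_ratio_closeness(sample)
```

Because `rtol` and `atol` are fixed numbers, the same pair of ratios can count as "close" on one scale of data and "not close" on another.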

The problem I face is that when I use the comparison function numpy.allclose(), its tolerance is fixed, so for jumps of varying size it stops working and gives both false positives and false negatives.

Some graphs for data visualization. The bottom graph in each shows the total number of points.

Recommended answer

First, you should visualise your problem to get a better understanding of what's going on:

import matplotlib.pyplot as plt

# Sample data from the question: a ramp that repeatedly drops back to zero.
data = (0.000000, 0.000500, 0.001500, 0.003000, 0.005500, 0.008700,
        0.012400, 0.000000, 0.000500, 0.001500, 0.003000, 0.005500,
        0.008700, 0.012400, 0.000000, 0.000500, 0.001500, 0.003000,
        0.005500, 0.008700, 0.012400, 0.000000, 0.000500, 0.001500,
        0.003000, 0.005500, 0.008700, 0.012400, 0.000000, 0.000500,
        0.001500, 0.003000, 0.005500, 0.008700, 0.012400, 0.000000,
        0.000500, 0.001500, 0.003000, 0.005500, 0.008700, 0.012400,
        0.012400)

# Plot each value against its index so the jumps are visible.
plt.scatter(range(len(data)), data)
plt.show()

Second, you need to implement step detection, which is well described on Wikipedia: http://en.wikipedia.org/wiki/Step_detection

Choose the method you think fits best and experiment with it.
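As one possible starting point (chosen here only for illustration, not necessarily the method the answer had in mind), large jumps can be flagged by thresholding the first differences against a robust scale estimate, the median absolute deviation (MAD). The threshold then adapts to the data instead of being a fixed tolerance. The function name `find_jumps` and the factor `k=5.0` are assumptions.

```python
import numpy as np

def find_jumps(values, k=5.0):
    """Return indices i where values[i] is reached by an unusually large step.

    A step counts as a jump when its deviation from the median step exceeds
    k times the median absolute deviation (MAD) of all steps. k is a tunable
    assumption, not a recommended constant.
    """
    values = np.asarray(values, dtype=float)
    diffs = np.diff(values)
    deviation = np.abs(diffs - np.median(diffs))
    scale = np.median(deviation)
    if scale == 0:
        # More than half of the steps are identical: fall back to the mean deviation.
        scale = np.mean(deviation)
    if scale == 0:
        # All steps are identical, so there is nothing to flag.
        return np.array([], dtype=int)
    # +1 because diffs[i] is the step from values[i] to values[i + 1].
    return np.flatnonzero(deviation > k * scale) + 1

# The sample data from the question: six identical ramps plus one repeated value.
cycle = [0.0, 0.0005, 0.0015, 0.003, 0.0055, 0.0087, 0.0124]
data = np.append(np.tile(cycle, 6), 0.0124)
print(find_jumps(data).tolist())  # [7, 14, 21, 28, 35]
```

The segments between consecutive jump indices are then candidate clusters. For noisier data, `k` would likely need tuning, or one of the windowed methods from the Wikipedia article may work better.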

Update

Just a thought: if all your data look similar to your example, you could also simply try a least-squares fit (http://en.wikipedia.org/wiki/Least_squares) of a sawtooth wave (http://en.wikipedia.org/wiki/Sawtooth_wave) to find the "jumps". This could be a starting point for further analysis.
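A minimal sketch of the sawtooth idea follows, under some simplifying assumptions: the period and phase are integers (in samples) found by brute-force search, and for each candidate the amplitude and offset come from a linear least-squares fit via `numpy.linalg.lstsq`. The name `fit_sawtooth` is hypothetical. A linear ramp is only an approximation here, because the sample data rise nonlinearly within each cycle.

```python
import numpy as np

def fit_sawtooth(values, max_period=None):
    """Brute-force least-squares fit of a linear sawtooth a * ramp + b.

    Returns the integer (period, phase) with the smallest sum of squared errors.
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    if max_period is None:
        max_period = n // 2  # require at least two full periods in the data
    index = np.arange(n)
    best = None  # (sse, period, phase)
    for period in range(2, max_period + 1):
        for phase in range(period):
            # Ramp rising from 0 towards 1, restarting every `period` samples.
            ramp = ((index - phase) % period) / period
            design = np.column_stack([ramp, np.ones(n)])
            coef, *_ = np.linalg.lstsq(design, values, rcond=None)
            sse = float(np.sum((design @ coef - values) ** 2))
            if best is None or sse < best[0]:
                best = (sse, period, phase)
    _, period, phase = best
    return period, phase

# The sample data from the question: six identical ramps plus one repeated value.
cycle = [0.0, 0.0005, 0.0015, 0.003, 0.0055, 0.0087, 0.0124]
data = np.append(np.tile(cycle, 6), 0.0124)
period, phase = fit_sawtooth(data)
# Positions where the fitted sawtooth restarts, i.e. candidate jumps.
boundaries = [i for i in range(phase, len(data), period) if i > 0]
```

The boundaries are positions predicted by the model, not confirmed jumps, so each one should still be checked against the data. For example, the last predicted restart in the sample falls on the repeated final value, where no drop actually occurs.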
