Algorithm to detect a linear behaviour in a data set

Posted: 2020/5/6 13:59:46 · Tag: matlab

Question

Some time ago I posted a question about an algorithm to make a polynomial fit of part of a data set, and received several suggestions for doing what I wanted. However, I now face another problem when I try to apply the ideas suggested in the answers. My goal is to find the best linear fit of a data set in which only a part is linear.

Here is an example of what I must do:

We have these two data sets, and I must fit a linear trend to the linear part of the data that lies to the left of the dashed line. In red is the ideal data set, which is linear from the beginning up to the dashed line. In blue is the 'problematic' data set, which has a plateau. The bold part is the part that I have to use for the linear fit.

My problem is that I tried to do as mentioned in the question linked above: I computed the second-order derivative of the smoothed data and looked for where it was no longer 'close enough' to 0. Here are my results for the problematic data set (first image) and for the ideal data set (second image):

(Sorry for the quality, I don't know why it is so blurred.) On both images, I plotted the first-order derivative and, in red, the second-order derivative. On the first image, we see peaks in the second derivative. The problem is that these peaks are not very 'high', which makes it difficult to establish a threshold that would tell whether the set is linear or not. By contrast, the peak of the first derivative is quite high, which makes it easy to see visually.
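The second-derivative test described above can be sketched roughly as follows. This is only an illustration in plain Python, not the code actually used: the moving-average smoother, the function names, the window size, the tolerance `tol` and the sample data are all assumptions made for the example.

```python
def moving_average(y, window):
    """Simple moving average; the output is shorter by window - 1 samples."""
    return [sum(y[i:i + window]) / window for i in range(len(y) - window + 1)]


def finite_difference(y, dx=1.0):
    """Forward difference (y[i+1] - y[i]) / dx, one sample shorter than y."""
    return [(b - a) / dx for a, b in zip(y, y[1:])]


def nonlinear_indices(y, window=3, dx=1.0, tol=0.05):
    """Indices into the second-derivative array where |d2y| exceeds tol."""
    smoothed = moving_average(y, window)
    d1 = finite_difference(smoothed, dx)
    d2 = finite_difference(d1, dx)
    return [i for i, v in enumerate(d2) if abs(v) > tol]


# Hypothetical data: a plateau followed by a line of slope 2.
y = [0.0] * 10 + [2.0 * k for k in range(1, 11)]

flagged_low_tol = nonlinear_indices(y, tol=0.05)   # transition is detected
flagged_high_tol = nonlinear_indices(y, tol=1.0)   # transition is missed
print(flagged_low_tol, flagged_high_tol)
```

The two calls show the difficulty raised in the question: smoothing spreads the kink over several samples and lowers the second-derivative peak, so the result depends heavily on the chosen tolerance.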

I thought it would be enough to calculate the mean of the first-derivative values and look for where the values differ too much from that mean. But when I take the mean over all the values, it is offset by the peak.
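A small sketch may make the offset concrete. The derivative values below are invented for illustration (a constant slope with a single large spike), and `deviation_from_mean` is a hypothetical helper name.

```python
from statistics import mean


def deviation_from_mean(d1):
    """Return the mean of d1 and each sample's deviation from that mean."""
    m = mean(d1)
    return m, [v - m for v in d1]


# Hypothetical first derivative: constant slope 2.0 with one large spike.
d1 = [2.0] * 9 + [22.0]
m, dev = deviation_from_mean(d1)

print(m)       # the mean is pulled above the underlying slope of 2.0
print(dev[0])  # so the linear samples no longer sit at zero deviation
```

Because the spike is included in the average, every sample in the linear region shows a non-zero deviation, which is the offset described above.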

How can I efficiently remove this offset so that the mean is taken only over the data to the right of the peak? (The data to the left of the discontinuity seen on image 1 could be non-linear, or could be linear but with a different value from the values to the right.)
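One possible reading of this requirement, shown only as a sketch: locate the peak as the largest absolute value of the first derivative, skip a few samples after it, and average what remains. The function name, the `guard` parameter and the sample data are assumptions, and the approach presumes there is a single dominant peak with the linear part to its right.

```python
from statistics import mean


def mean_right_of_peak(d1, guard=1):
    """Mean of the samples located after the largest |d1| value.

    guard: number of samples skipped right after the peak (assumed value),
    so that the tail of the spike does not leak into the average.
    """
    peak = max(range(len(d1)), key=lambda i: abs(d1[i]))
    right = d1[peak + 1 + guard:]
    if not right:
        raise ValueError("no samples to the right of the peak")
    return peak, mean(right)


# Hypothetical derivative: plateau (0), a spike, then a constant slope of 2.0.
d1 = [0.0] * 5 + [15.0, 6.0] + [2.0] * 8
peak, slope = mean_right_of_peak(d1)
print(peak, slope)
```

Samples to the left of the peak are ignored entirely, so it does not matter whether that part is non-linear or linear with a different slope.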
