Python: Pinpointing the Linear Part of a Slope


Problem Description

I have several plots that look like this:

[figure omitted: example data plot]

I am wondering what kinds of methods there might be for finding the slope between approximately 5.5 and 8 on the x-axis. Since there are several plots like this, what I am really wondering is whether there is a way to find the slope value automatically.

Any suggestions?

I am thinking of polyfit(), or a linear regression. The problem is that I am unsure of how to find the values automatically.
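For the manual case, `np.polyfit` on a hand-picked x-window already gives the slope; a minimal sketch with made-up data standing in for one of the plots (the window 5.5 to 8 is taken from the question, everything else here is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical noisy data standing in for one of the plots
x = np.linspace(0, 10, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, x.size)

# fit a first-degree polynomial only inside the manually chosen window
mask = (x >= 5.5) & (x <= 8.0)
slope, intercept = np.polyfit(x[mask], y[mask], 1)
```

The open problem is how to choose that window automatically, which is what the answer below addresses.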

Recommended Answer

A generic way to find linear parts in data sets is to calculate the second derivative of the function, and see where it is (close to) zero. There are several things to consider on the way to the solution:

  • How to calculate the second derivative of noisy data? One fast and simple method, that can easily be adapted to different noise levels, data set sizes and expected lengths of the linear patch, is to convolve the data with a convolution kernel equal to the second derivative of a Gaussian. The adjustable part is the width of the kernel.

  • What does "close to zero" mean in your context? To answer this question, you have to experiment with your data.
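As a convenience (this is an alternative to the hand-built kernel used in the source code further down, not part of the original answer), SciPy's `gaussian_filter1d` with `order=2` convolves the data with the second derivative of a Gaussian directly; `sigma` (in samples) plays the role of the kernel width and has to be tuned:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)

# assumed test signal: curved (sine) up to x = 5, exactly linear after
x = np.linspace(0, 10, 500)
y = np.where(x < 5, np.sin(x), np.sin(5) + np.cos(5) * (x - 5))
y_noisy = y + rng.normal(0, 0.05, x.size)

# order=2 means: convolve with the second derivative of a Gaussian
d2 = gaussian_filter1d(y_noisy, sigma=15, order=2)
```

On the linear half, `d2` scatters around zero; on the sine half it follows the true curvature (scaled by the sample spacing squared), which gives a concrete handle on what "close to zero" means for a given noise level.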

The results of this method could be used as an input to the chi^2-method described above, to identify candidate regions in the data set.

Here is some source code that will get you started:

from matplotlib import pyplot as plt
import numpy as np

# create theoretical data
x_a = np.linspace(-8, 0, 60)
y_a = np.sin(x_a)
x_b = np.linspace(0, 4, 30)[1:]
y_b = x_b[:]
x_c = np.linspace(4, 6, 15)[1:]
y_c = np.sin((x_c - 4)/4*np.pi)/np.pi*4. + 4
x_d = np.linspace(6, 14, 120)[1:]
y_d = np.zeros(len(x_d)) + 4 + (4/np.pi)

x = np.concatenate((x_a, x_b, x_c, x_d))
y = np.concatenate((y_a, y_b, y_c, y_d))

# make noisy data from theoretical data
y_n = y + np.random.normal(0, 0.27, len(x))

# create convolution kernel for calculating
# the smoothed second order derivative
smooth_width = 59
x1 = np.linspace(-3, 3, smooth_width)
norm = np.sum(np.exp(-x1**2)) * (x1[1] - x1[0])  # ad hoc normalization
y1 = (4*x1**2 - 2) * np.exp(-x1**2) / smooth_width * 8  # alternative scaling: norm*(x1[1]-x1[0])

# calculate second order deriv.
y_conv = np.convolve(y_n, y1, mode="same")

# plot data
plt.plot(x, y_conv, label="second deriv")
plt.plot(x, y_n, "o", label="noisy data")
plt.plot(x, y, label="theory")
plt.plot(x, x, "0.3", label="linear data")
plt.hlines([0], -10, 20)
plt.axvspan(0, 4, color="y", alpha=0.2)
plt.axvspan(6, 14, color="y", alpha=0.2)
plt.axhspan(-1, 1, color="b", alpha=0.2)
plt.vlines([0, 4, 6], -10, 10)
plt.xlim(-2.5, 12)
plt.ylim(-2.5, 6)
plt.legend(loc=0)
plt.show()

Here is the result:

[figure omitted: plot produced by the code above]

smooth_width is the width of the convolution kernel. To adjust the amount of noise, change the value 0.27 in random.normal to different values. Also note that this method does not work well close to the borders of the data space.

As you can see, the "close-to-zero" requirement for the second derivative (blue line) holds quite well for the yellow parts, where the data is linear.
