Fit a curve for data made up of two distinct regimes
Question
I'm looking for a way to plot a curve through some experimental data. The data shows a small linear regime with a shallow gradient, followed by a steep linear regime after a threshold value.
My data is here: http://pastebin.com/H4NSbxqr
I could fit the data with two lines relatively easily, but ideally I'd like to fit it with a single continuous curve, which should look like two lines joined by a smooth bend around the threshold (~5000 in the data, shown above).
I attempted this using scipy.optimize curve_fit, with a function made up of the sum of a straight line and an exponential:
y = a*x + b + c*np.exp((x-d)/e)
However, despite numerous attempts, it didn't find a solution.
If anyone has any suggestions, either on the choice of fitting function / method or on the curve_fit implementation, they would be greatly appreciated.
Recommended answer
If you don't have a particular reason to believe that linear + exponential is the true underlying cause of your data, then I think a fit to two lines makes the most sense. You can do this by making your fitting function the maximum of two lines, for example:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def two_lines(x, a, b, c, d):
    one = a*x + b
    two = c*x + d
    return np.maximum(one, two)
Then,
x, y = np.genfromtxt('tmp.txt', unpack=True, delimiter=',')
pw0 = (.02, 30, .2, -2000) # a guess for slope, intercept, slope, intercept
pw, cov = curve_fit(two_lines, x, y, pw0)
crossover = (pw[3] - pw[1]) / (pw[0] - pw[2])
plt.plot(x, y, 'o', x, two_lines(x, *pw), '-')
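Since the pastebin data is not reproduced here, the following is a minimal sketch of the same piecewise fit on synthetic data. The slopes, intercepts, noise level, seed, and the `_demo` names are made-up values chosen only to resemble the description (a shallow slope, then a steep slope past about 5000). It also reads one-sigma parameter uncertainties off the diagonal of the covariance matrix returned by curve_fit.

```python
import numpy as np
from scipy.optimize import curve_fit

def two_lines(x, a, b, c, d):
    # Maximum of two straight lines: the steeper one takes over past the crossover
    return np.maximum(a*x + b, c*x + d)

# Synthetic stand-in for the experimental data (all values are assumptions)
rng = np.random.default_rng(0)
x_demo = np.linspace(0, 10000, 200)
true_pars = (0.02, 30.0, 0.2, -870.0)  # lines cross at (d - b)/(a - c) = 5000
y_demo = two_lines(x_demo, *true_pars) + rng.normal(0, 5.0, x_demo.size)

# The initial guess should put the crossover inside the data range; otherwise
# one line is never the maximum and its parameters have no effect on the fit.
p0 = (0.01, 0.0, 0.3, -1200.0)
p_fit, p_cov = curve_fit(two_lines, x_demo, y_demo, p0)
p_err = np.sqrt(np.diag(p_cov))  # one-sigma uncertainties on a, b, c, d

crossover_demo = (p_fit[3] - p_fit[1]) / (p_fit[0] - p_fit[2])
print("fitted parameters:", p_fit)
print("uncertainties:    ", p_err)
print("crossover at x =", crossover_demo)
```

The point about the initial guess is worth checking on real data too: if the guessed lines cross outside the measured x range, the fit tends to stall with one line unused.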
If you really want a continuous and differentiable solution, it occurred to me that a hyperbola has a sharp bend to it, but it has to be rotated. It was a bit difficult to implement (maybe there's an easier way), but here's a go:
def hyperbola(x, a, b, c, d, e):
    """ hyperbola(x) with parameters
        a/b = asymptotic slope
         c  = curvature at vertex
         d  = offset to vertex
         e  = vertical offset
    """
    return a*np.sqrt((b*c)**2 + (x-d)**2)/b + e

def rot_hyperbola(x, a, b, c, d, e, th):
    pars = a, b, c, 0, 0 # do the shifting after rotation
    xd = x - d
    hsin = hyperbola(xd, *pars)*np.sin(th)
    xcos = xd*np.cos(th)
    return e + hyperbola(xcos - hsin, *pars)*np.cos(th) + xcos - hsin
Run it:
h0 = 1.1, 1, 0, 5000, 100, .5
h, hcov = curve_fit(rot_hyperbola, x, y, h0)
plt.plot(x, y, 'o', x, two_lines(x, *pw), '-', x, rot_hyperbola(x, *h), '-')
plt.legend(['data', 'piecewise linear', 'rotated hyperbola'], loc='upper left')
plt.show()
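On the "maybe there's an easier way" point, one possible alternative (not the rotated hyperbola above, but a different technique swapped in) is a softplus-based smooth hinge: a straight line plus a term that switches on gradually around a threshold. The sketch below is only an illustration under assumptions: the function name `smooth_hinge`, the synthetic data, and the starting values are all hypothetical, and the parameter s (the width of the bend) is something the fit has to be able to resolve from the data.

```python
import numpy as np
from scipy.optimize import curve_fit

def smooth_hinge(x, a, b, c, d, s):
    # Well below d this is a*x + b; well above d the slope becomes a + c.
    # s sets the width of the bend; abs() stops a negative s from mirroring the hinge.
    s = np.abs(s) + 1e-12
    # np.logaddexp(0, z) is a numerically stable log(1 + exp(z)), i.e. softplus
    return a*x + b + c*s*np.logaddexp(0.0, (x - d)/s)

# Synthetic stand-in for the experimental data (all values are assumptions)
rng = np.random.default_rng(1)
x_demo = np.linspace(0, 10000, 200)
true_pars = (0.02, 30.0, 0.18, 5000.0, 300.0)
y_demo = smooth_hinge(x_demo, *true_pars) + rng.normal(0, 5.0, x_demo.size)

s0 = (0.01, 0.0, 0.25, 4500.0, 500.0)  # rough guess: slope, intercept, slope change, threshold, width
sp, scov = curve_fit(smooth_hinge, x_demo, y_demo, s0)

print("threshold d =", sp[3], " bend width s =", abs(sp[4]))
print("slope below:", sp[0], " slope above:", sp[0] + sp[2])
```

Compared with the rotated hyperbola, the parameters here map directly onto the quantities of interest (two slopes, a threshold, and a bend width), which may make choosing starting values easier.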
I was also able to get the line + exponential to converge, but it looks terrible. This is because it's not a good descriptor of your data: the data is linear in both regimes, and an exponential is very far from linear!
def line_exp(x, a, b, c, d, e):
    return a*x + b + c*np.exp((x-d)/e)
e0 = .1, 20., .01, 1000., 2000.
e, ecov = curve_fit(line_exp, x, y, e0)
If you want to keep it simple, there's always a polynomial or a spline (piecewise polynomials):
from scipy.interpolate import UnivariateSpline
s = UnivariateSpline(x, y, s=x.size) #larger s-value has fewer "knots"
plt.plot(x, s(x))
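To illustrate the comment about the s-value, here is a small sketch on synthetic data (the data, noise level `sigma`, and variable names are assumptions for illustration). With unit weights, the spline is built so that the sum of squared residuals does not exceed s, so a value near the number of points times the noise variance is a reasonable starting point; a much smaller s forces the spline to chase the noise and add many more knots.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic two-regime data (all values are assumptions)
rng = np.random.default_rng(2)
sigma = 5.0
x_demo = np.linspace(0, 10000, 200)  # UnivariateSpline needs x in increasing order
y_demo = np.maximum(0.02*x_demo + 30, 0.2*x_demo - 870) + rng.normal(0, sigma, x_demo.size)

# s = number of points: only appropriate if the noise standard deviation is about 1
tight = UnivariateSpline(x_demo, y_demo, s=x_demo.size)
# s = number of points * noise variance: matches the actual noise level here
loose = UnivariateSpline(x_demo, y_demo, s=x_demo.size*sigma**2)

print("knots with small s:", len(tight.get_knots()))
print("knots with large s:", len(loose.get_knots()))
print("residual with large s:", loose.get_residual())
```

If the real data is not already sorted by x, sorting both arrays with `np.argsort(x)` before building the spline avoids an error.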