fitting step function with variation in the step location with scipy optimize curve_fit


Problem Description

I am trying to fit x, y data which look something like

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x = np.linspace(-2, 2, 1000)
a = 0.5
yl = np.ones_like(x[x < a]) * -0.4 + np.random.normal(0, 0.05, x[x < a].shape[0])
yr = np.ones_like(x[x >= a]) * 0.4 + np.random.normal(0, 0.05, x[x >= a].shape[0])
y = np.concatenate((yl, yr))
plt.scatter(x, y, s=2, color='k')

I'm using a variation of the Heaviside step function

def f(x, a, b): return 0.5 * b * (np.sign(x - a))

and fitting with

popt, pcov = curve_fit(f, x, y, p0=p)

where p is some initial guess. For any p, curve_fit fits only b and not a. For example:

popt, pcov = curve_fit(f, x, y, p0=[-1.0, 0]) we get that popt is [-1., 0.20117665]

popt, pcov = curve_fit(f, x, y, p0=[.5, 2]) we get that popt is [.5, 0.79902]

popt, pcov = curve_fit(f, x, y, p0=[1.5, -2]) we get that popt is [1.5, 0.40128229]

Why is curve_fit not fitting a?

Recommended Answer

As mentioned by others, curve_fit (and all the other solvers in scipy.optimize) work well for optimizing continuous but not discrete variables. They all work by making small (like, at the 1.e-7 level) changes to the parameter values, seeing what (if any) change that makes in the result, and using that change to refine those values until the smallest residual is found. With your model function using np.sign:

def f(x, a, b): return 0.5 * b * (np.sign(x - a))

such a small change in the value of a will not change the model or fit result at all. That is, the fit will first try the starting value of, say, a=-1.0 or a=0.5, and then try a=-0.999999995 or a=0.500000005. Those will both give the same result for np.sign(x - a). The fit does not know that it would need to change a by 1 to have any effect on the result; it cannot know this. np.sign() and np.sin() differ by one letter, but behave very differently in this respect.
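A minimal sketch (using the same grid and model as above) confirms this: a 1.e-7 nudge in a leaves the model output bit-for-bit identical, so the solver sees zero gradient in that parameter.

```python
import numpy as np

x = np.linspace(-2, 2, 1000)

def f(x, a, b):
    return 0.5 * b * np.sign(x - a)

base = f(x, -1.0, 1.0)
nudged = f(x, -1.0 + 1e-7, 1.0)  # the kind of step the solver takes
print(np.array_equal(base, nudged))  # True: no x falls between a and a + 1e-7
```

Since no grid point lies inside the tiny interval the nudge sweeps over, every call returns exactly the same array, and the solver concludes a has no effect.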

It is pretty common for real data to take a step but to be sampled finely enough that the step does not happen completely between two points. In that case, you would be able to model the step with a variety of functional forms (linear ramp, error function, arc-tangent, logistic, etc). The thorough answer from @JamesPhilipps gives one approach. I would probably use lmfit (being one of its main authors) and be willing to guess starting values for the parameters from looking at the data, perhaps:

import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import StepModel, ConstantModel

x = np.linspace(-2, 2, 1000)
a = 0.5
yl = np.ones_like(x[x < a]) * -0.4 + np.random.normal(0, 0.05, x[x < a].shape[0])
yr = np.ones_like(x[x >= a]) * 0.4 + np.random.normal(0, 0.05, x[x >= a].shape[0])
y = np.concatenate((yl, yr))

model = StepModel() + ConstantModel()
params = model.make_params(center=0, sigma=1, amplitude=1., c=-0.5)

result = model.fit(y, params, x=x)

print(result.fit_report())

plt.scatter(x, y, label='data')
plt.plot(x, result.best_fit, marker='o', color='r', label='fit')
plt.legend()
plt.show()

which would give a good fit and print out results of

[[Model]]
    (Model(step, form='linear') + Model(constant))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 50
    # data points      = 1000
    # variables        = 4
    chi-square         = 2.32729556
    reduced chi-square = 0.00233664
    Akaike info crit   = -6055.04839
    Bayesian info crit = -6035.41737
##  Warning: uncertainties could not be estimated:
[[Variables]]
    amplitude:  0.80013762 (init = 1)
    center:     0.50083312 (init = 0)
    sigma:      4.6009e-04 (init = 1)
    c:         -0.40006255 (init = -0.5)

Note that it will find the center of the step because it assumed there was some finite width (sigma) to the step, but then found that width to be smaller than the step size in x. But also note that it cannot calculate the uncertainties in the parameters because, as above, a small change in center (your a) near the solution does not change the resulting fit. FWIW the StepModel can use a linear, error-function, arc-tangent, or logistic as the step function.
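For reference, those four step shapes can be written out explicitly. The sketch below follows lmfit's documented forms (names and formulas from the lmfit model docs, scaled to rise from 0 to 1; check the docs before relying on the exact parameterization):

```python
import numpy as np
from scipy.special import erf

def step(x, center=0.0, sigma=1.0, form='linear'):
    """Unit step shapes matching lmfit StepModel's `form` options."""
    arg = (x - center) / sigma
    if form == 'linear':
        return np.clip(arg, 0, 1)               # ramp over one sigma
    if form == 'erf':
        return 0.5 * (1 + erf(arg))             # error function
    if form == 'atan':
        return 0.5 + np.arctan(arg) / np.pi     # arc-tangent
    if form == 'logistic':
        return 1 / (1 + np.exp(-arg))           # logistic sigmoid
    raise ValueError(f"unknown form: {form}")
```

The smooth forms (erf, atan, logistic) are differentiable everywhere in center and sigma, which is exactly what lets a least-squares solver refine the step location.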

If you had constructed the test data to have a small width to the step, say with something like

from scipy.special import erf    
y = 0.638  * erf((x-0.574)/0.005)  + np.random.normal(0, 0.05, len(x))

then the fit would have been able to find the best solution and evaluate the uncertainties.
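As a sketch of that (the smooth_step model function and the starting values below are my own choices, guessed from looking at the data, not taken from the question), plain scipy curve_fit now recovers the step location and gives finite uncertainties, because the erf model is differentiable in center:

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf

rng = np.random.default_rng(0)
x = np.linspace(-2, 2, 1000)
y = 0.638 * erf((x - 0.574) / 0.005) + rng.normal(0, 0.05, len(x))

def smooth_step(x, amplitude, center, width):
    # erf ramps smoothly from -amplitude to +amplitude around `center`
    return amplitude * erf((x - center) / width)

# starting values guessed from eyeballing the data
popt, pcov = curve_fit(smooth_step, x, y, p0=[0.5, 0.5, 0.05])
perr = np.sqrt(np.diag(pcov))
print(popt)  # center lands near 0.574
print(perr)  # finite uncertainties, unlike the pure sign() model
```

The key difference from the np.sign model is that a 1.e-7 nudge in center now produces a small but nonzero change in the residual, so the solver can follow the gradient to the true step location.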

I hope that explains why the fit with your model function could not refine the value of a, and what might be done about it.
