scipy curve_fit strange result


Question

I am trying to fit a distribution with scipy's curve_fit. I first tried a one-component exponential function, which resulted in an almost straight line (see figure). I also tried a two-component exponential fit, which seemed to work nicely. "Two components" just means that part of the equation is repeated with different input parameters. Here is the one-component fit function:

def Exponential(Z,w0,z0,Z0):
    z = Z - Z0
    termB = (newsigma**2 + z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    termA = (newsigma**2 - z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    return w0/2.0 * numpy.exp(-(z**2 / (2.0*newsigma**2))) * (numpy.exp(termA**2)*erfc(termA) + numpy.exp(termB**2)*erfc(termB))

The fit is done with:

fitexp = curve_fit(Exponential,newx,y2)

Then I tried something, just to see what would happen. I added the two extra parameters of the two-component fit to the function signature, but did not use them in the calculation.

def ExponentialNew(Z,w0,z0,w1,z1,Z0):
    z = Z - Z0
    termB = (newsigma**2 + z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    termA = (newsigma**2 - z*z0) / (numpy.sqrt(2.0)*newsigma*z0)
    return w0/2.0 * numpy.exp(-(z**2 / (2.0*newsigma**2))) * (numpy.exp(termA**2)*erfc(termA) + numpy.exp(termB**2)*erfc(termB))

Suddenly it worked.

Now my question is: why? As you can see, there is absolutely no difference in the calculation of the fit. The function just takes two extra parameters that are not used. Shouldn't this give the same result?

@Andras_Deak An actual example:

from scipy.special import erfc
import numpy
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

#setup data
x = [-58.,-54.,-50.,-46.,-42.,-38.,-34.,-30.,-26.,-22.,-18.,-14.,-10.,-6.,-2.,2.,6.,10.,14.,18.,22.,26.,30.,34.,38.,42.,46.,50.,54.,58.]
y = [23.06763817, 16.89802085, 17.83258379, 16.63446237, 13.81878965, 12.97965839, 14.30451789, 16.98288216, 22.26811491, 28.56756908, 33.06990344, 38.59842098, 54.19860393, 86.37381604, 137.47253315, 199.49724512, 238.66047662, 219.89405445, 160.68820199, 103.88901303, 65.92405727, 43.84596266, 31.5395342, 25.9610156, 22.71683709, 18.06740651, 13.85362374, 11.12867065, 10.36502799, 11.31855619]
y_err = [17.9823065, 4.13684885, 1.66490726, 2.4109372, 2.93359141, 1.9701747, 3.19214881,  3.65593012, 2.89089074, 3.58922121, 4.25505348, 4.72728874, 6.77736567, 11.3888196, 21.87771722, 39.0087495, 56.6910311, 51.7592369, 26.39750958, 10.62678862, 7.85893395, 8.11741621, 7.91731416, 7.07739132, 5.41818744, 6.11286843, 8.27070757, 7.85323065, 4.26885499, 0.9047867]

#function to fit
def Exponential2(Z, w0, z0, w1, z1, Z0):
    z = Z - Z0
    s = 3.98098937586
    a = z**2 / (2.0*s**2)
    b = (s**2 + z*z0) / (numpy.sqrt(2.0)*s*z0)
    c = (s**2 - z*z0) / (numpy.sqrt(2.0)*s*z0)
    d = (s**2 + z*z1) / (numpy.sqrt(2.0)*s*z1)
    e = (s**2 - z*z1) / (numpy.sqrt(2.0)*s*z1)
    return w0/2.0 * numpy.exp(-a) * (numpy.exp(c**2)*erfc(c) + numpy.exp(b**2)*erfc(b)) + w1/2.0 * numpy.exp(-a) * (numpy.exp(e**2)*erfc(e) + numpy.exp(d**2)*erfc(d))


#derive and set initial guess
ymaxpos = x[int(numpy.argmax(y))]  # x position of the largest y value (x and y are plain lists)
p0_2 = [numpy.max(y),5,numpy.max(y)/2.0,20,ymaxpos]

#fit
fitexp2 = curve_fit(Exponential2,x,y,p0=p0_2,sigma=y_err)

#get results
w0err = numpy.sqrt(numpy.diag(fitexp2[1]))[0]
z0err = numpy.sqrt(numpy.diag(fitexp2[1]))[1]
w1err = numpy.sqrt(numpy.diag(fitexp2[1]))[2]
z1err = numpy.sqrt(numpy.diag(fitexp2[1]))[3]
w0 = fitexp2[0][0]
z0 = fitexp2[0][1]
w1 = fitexp2[0][2]
z1 = fitexp2[0][3]
Z0 = fitexp2[0][4]
#new x array for smoother curve
smoothx = numpy.arange(-58,59,0.1)
y2 = Exponential2(smoothx,w0,z0,w1,z1,Z0)

print('Exponential 2: w0: '+str(w0.round(3))+' +/- '+str(w0err.round(3))+' \t z0: '+str(z0.round(3))+' +/- '+str(z0err.round(3))+' \t w1: '+str(w1.round(3))+' +/- '+str(w1err.round(3))+' \t\t z1: '+str(z1.round(3))+' +/- '+str(z1err.round(3)))

#plot
fig = plt.figure()
ax = fig.add_subplot(111)
ax.errorbar(x,y,y_err,fmt='o',markersize=2,label='data')
ax.plot(smoothx,y2,label='fit',color='red')
ax.grid()
ax.legend()
plt.show()

As you can see, the plot does look good, but the returned value of z1 is totally unrealistic.

Exponential 2: w0: 312.608 +/- 36.764    z0: 8.263 +/- 1.158     w1: 12.689 +/- 9.138        z1: 1862257.883 +/- 45201809883.8
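
One way to read this output is to compare each uncertainty with the value it belongs to. The sketch below does that for the four numbers printed above. The helper name `poorly_constrained` and the threshold of 1.0 are assumptions made for illustration, not part of scipy. A relative error far above 1 usually suggests that the data hardly constrain that parameter.

```python
import numpy as np

def poorly_constrained(names, values, errors, max_rel_err=1.0):
    """Return the names of parameters whose relative error exceeds max_rel_err.

    Hypothetical helper: the threshold is an arbitrary choice for illustration.
    """
    values = np.asarray(values, dtype=float)
    errors = np.asarray(errors, dtype=float)
    # A zero value would divide by zero; treat the resulting inf/nan as "flagged".
    with np.errstate(divide="ignore", invalid="ignore"):
        rel = errors / np.abs(values)
    return [n for n, r in zip(names, rel) if not np.isfinite(r) or r > max_rel_err]

# Values and uncertainties copied from the output printed in the question.
names = ["w0", "z0", "w1", "z1"]
values = [312.608, 8.263, 12.689, 1862257.883]
errors = [36.764, 1.158, 9.138, 45201809883.8]

flagged = poorly_constrained(names, values, errors)
print(flagged)  # ['z1']
```

With these numbers only z1 is flagged: its uncertainty is roughly four orders of magnitude larger than the value itself, while w0, z0 and w1 all have relative errors below 1.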

Recommended answer

In my experience curve_fit can sometimes act up and stick with the initial values of the parameters. I suspect that in your case adding a few fake parameters changed the heuristics that determine how the relevant parameters are initialized (although this contradicts the documentation's statement that, when no initial values are given, they all default to 1).

It helps a lot in obtaining reliable fits if you specify reasonable bounds and initial values for your fitting parameters (I mean the p0 and bounds keywords). The fact that the default starting values are all supposed to be 1 suggests that for most use cases the default won't cut it.
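
As a minimal sketch of that advice, the code below fits the one-component model from the question with explicit p0 and bounds. Several things here are assumptions rather than part of the original post: the data are synthetic (generated from the model itself with made-up "true" parameters), the width is fixed to the value of s used in the question's example, and the bound values are only plausible guesses for data on this x range.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erfc

S = 3.98098937586  # fixed width, same value as `s` in the question's example

def exponential1(Z, w0, z0, Z0):
    """One-component model from the question, with the width fixed to S."""
    z = Z - Z0
    b = (S**2 + z*z0) / (np.sqrt(2.0)*S*z0)
    c = (S**2 - z*z0) / (np.sqrt(2.0)*S*z0)
    return w0/2.0 * np.exp(-z**2 / (2.0*S**2)) * (
        np.exp(c**2)*erfc(c) + np.exp(b**2)*erfc(b))

# Synthetic data (hypothetical, NOT the data from the question).
rng = np.random.default_rng(0)
x = np.linspace(-58.0, 58.0, 30)
true_params = (300.0, 8.0, 5.0)  # made-up w0, z0, Z0
y = exponential1(x, *true_params) + rng.normal(0.0, 1.0, x.size)

# Initial guess derived from the data, in the same spirit as p0_2 in the question.
p0 = [y.max(), 5.0, x[np.argmax(y)]]

# Assumed bounds: non-negative amplitude, a scale length between 1 and 50,
# and a centre that stays inside the measured x range.
bounds = ([0.0, 1.0, x.min()], [np.inf, 50.0, x.max()])

popt, pcov = curve_fit(exponential1, x, y, p0=p0, bounds=bounds)
perr = np.sqrt(np.diag(pcov))  # one-sigma uncertainties

for name, value, err in zip(("w0", "z0", "Z0"), popt, perr):
    print(f"{name}: {value:.3f} +/- {err:.3f}")
```

The lower bound on z0 also keeps the optimizer away from very small scale lengths, where the exp(b**2) factor in this formulation can overflow. Passing bounds makes curve_fit use a bounded least-squares method instead of its default unbounded one, so results can differ slightly from an unbounded fit.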
