Correct fitting with scipy curve_fit including errors in x?


Problem description

I'm trying to fit a histogram with some data in it using scipy.optimize.curve_fit. If I want to add an error in y, I can simply do so by applying a weight to the fit. But how do I apply the error in x (i.e. the error due to binning in the case of histograms)?

My question also applies to errors in x when making a linear regression with curve_fit or polyfit; I know how to add errors in y, but not in x.

Here is an example (partly from the matplotlib documentation):

import numpy as np
import pylab as P
from scipy.optimize import curve_fit

# create the data histogram
mu, sigma = 200, 25
x = mu + sigma*P.randn(10000)

# define fit function
def gauss(x, *p):
    A, mu, sigma = p
    return A*np.exp(-(x-mu)**2/(2*sigma**2))

# the histogram of the data
n, bins, patches = P.hist(x, 50, histtype='step')
sigma_n = np.sqrt(n)  # Adding Poisson errors in y
bin_centres = (bins[:-1] + bins[1:])/2
sigma_x = (bins[1] - bins[0])/np.sqrt(12)  # Binning error in x
P.setp(patches, 'facecolor', 'g', 'alpha', 0.75)

# fitting and plotting
p0 = [700, 200, 25]
popt, pcov = curve_fit(gauss, bin_centres, n, p0=p0, sigma=sigma_n, absolute_sigma=True)
x = np.arange(100, 300, 0.5)
fit = gauss(x, *popt)
P.plot(x, fit, 'r--')

Now, this fit (when it doesn't fail) does consider the y-errors sigma_n, but I haven't found a way to make it consider sigma_x. I scanned a couple of threads on the scipy mailing list and found out how to use the absolute_sigma value, and a post on Stack Overflow about asymmetrical errors, but nothing about errors in both directions. Is it possible to achieve this?

Recommended answer

scipy.optimize.curve_fit uses standard non-linear least-squares optimization and therefore only minimizes the deviation in the response variable. If you want the error in the independent variable to be considered, you can try scipy.odr, which uses orthogonal distance regression. As its name suggests, it minimizes the distance in both the independent and the dependent variables.

Have a look at the sample below. The fit_type parameter determines whether scipy.odr does full ODR (fit_type=0) or least squares optimization (fit_type=2).

EDIT

Although the example worked, it did not make much sense, since the y data was calculated from the noisy x data, which just resulted in an unequally spaced independent variable. I updated the sample, which now also shows how to use RealData, which allows specifying the standard errors of the data instead of the weights.

from scipy.odr import ODR, Model, RealData
import numpy as np
import matplotlib.pyplot as plt

def func(beta, x):
    y = beta[0] + beta[1]*x + beta[2]*x**3
    return y

# generate data
x = np.linspace(-3, 2, 100)
y = func([-2.3, 7.0, -4.0], x)

# add some noise
x += np.random.normal(scale=0.3, size=100)
y += np.random.normal(scale=0.1, size=100)

# RealData takes the standard errors of x and y directly, not weights
data = RealData(x, y, 0.3, 0.1)
model = Model(func)

odr = ODR(data, model, [1, 0, 0])

# ordinary least squares for comparison (fit_type=2)
odr.set_job(fit_type=2)
output = odr.run()

xn = np.linspace(-3, 2, 50)
yn = func(output.beta, xn)
plt.plot(x, y, 'ro')
plt.plot(xn, yn, 'k-', label='leastsq')

# full orthogonal distance regression (fit_type=0)
odr.set_job(fit_type=0)
output = odr.run()
yn = func(output.beta, xn)
plt.plot(xn, yn, 'g-', label='odr')
plt.legend(loc=0)
plt.show()
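To tie this back to the original histogram question, here is a minimal sketch (my addition, not part of the original answer) of how the question's Gaussian fit could be run through scipy.odr, feeding in both sigma_x and sigma_n as standard errors. The gauss_odr wrapper and the mask that drops empty bins are names introduced here for illustration:

import numpy as np
from scipy.odr import ODR, Model, RealData

# recreate the question's histogram data
samples = 200 + 25*np.random.randn(10000)
n, bins = np.histogram(samples, 50)
bin_centres = (bins[:-1] + bins[1:])/2
sigma_n = np.sqrt(n)                        # Poisson error in y
sigma_x = (bins[1] - bins[0])/np.sqrt(12)   # binning error in x

# same Gaussian as in the question, rewritten with ODR's (beta, x) signature
def gauss_odr(beta, x):
    A, mu, sigma = beta
    return A*np.exp(-(x - mu)**2/(2*sigma**2))

# drop empty bins so the y standard errors stay strictly positive
mask = n > 0
data = RealData(bin_centres[mask], n[mask],
                sx=np.full(mask.sum(), sigma_x),
                sy=sigma_n[mask])
odr = ODR(data, Model(gauss_odr), beta0=[700, 200, 25])
output = odr.run()
print(output.beta)     # fitted A, mu, sigma
print(output.sd_beta)  # standard errors of the parameters

Since fit_type is left at its default here, scipy.odr performs the full orthogonal distance regression, so both error arrays enter the fit.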
