How to determine the uncertainty of fit parameters with Python?
Problem description
I have the following data for x and y:
x y
1.71 0.0
1.76 5.0
1.81 10.0
1.86 15.0
1.93 20.0
2.01 25.0
2.09 30.0
2.20 35.0
2.32 40.0
2.47 45.0
2.65 50.0
2.87 55.0
3.16 60.0
3.53 65.0
4.02 70.0
4.69 75.0
5.64 80.0
7.07 85.0
9.35 90.0
13.34 95.0
21.43 100.0
For the above data, I am trying to fit the data in the form y = a / x^n + b, as used in the MWE below.
However, both x and y carry uncertainties: x has an uncertainty of 50% of its value, and y has a fixed uncertainty. I am trying to determine the uncertainty in the fit parameters with the uncertainties package, but I am having trouble with curve fitting using the curve_fit function from scipy.optimize. I get the following error:
minpack.error: Result from function call is not a proper array of floats.
How do I fix this error and determine the uncertainty of the fit parameters (a, b and n)?
MWE
from __future__ import division
import numpy as np
import re
from scipy import optimize, interpolate, spatial
from scipy.interpolate import UnivariateSpline
from uncertainties import unumpy
def linear_fit(x, a, b):
    return a * x + b
uncertainty = 0.5
y_error = 1.2
x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65, 2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
x_uncertainty = x * uncertainty
x = unumpy.uarray(x, x_uncertainty)
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])
y = unumpy.uarray(y, y_error)
n = np.arange(0, 5, 0.005)
coefficient_determination_on = np.empty(shape = (len(n),))
for j in range(len(n)):
    n_correlation = n[j]
    x_fit = 1 / ((x) ** n_correlation)
    y_fit = y
    fit_a_raw, fit_b_raw = optimize.curve_fit(linear_fit, x_fit, y_fit)[0]
    x_prediction = (fit_a_raw / ((x) ** n_correlation)) + fit_b_raw
    y_residual_squares = np.sum((x_prediction - y) ** 2)
    y_total_squares = np.sum((y - np.mean(y)) ** 2)
    coefficient_determination_on[j] = 1 - (y_residual_squares / y_total_squares)
Recommended answer
Let me preface this by saying that the problem cannot be solved "nicely" if you want to solve for a, b and n together. For a fixed n, your problem admits a closed-form solution; if n is left free, it does not, and the problem may in fact have multiple solutions. Hence classical error analysis (such as that used by uncertainties) breaks down and you have to resort to other methods.
If n is fixed, your problem is that the libraries you call do not support uarray, so you have to use a workaround. Thankfully, linear fitting (under the l2 distance) is simply linear least squares, which admits a closed-form solution: pad the values with a column of ones and solve the normal equations.
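The normal equations are (X^T X) β = X^T y, where X is the matrix holding the transformed x values in its first column and ones in its second, y is the vector of observations, and β = (a, b) holds the fit coefficients.
You could do it like this: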
import numpy as np
from uncertainties import unumpy

uncertainty = 0.5
y_error = 1.2
n = 1.0

# Define x and y
x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65,
              2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])

# Take power of x values according to n
x_pow = x ** n
x_uncertainty = x_pow * uncertainty
x_fit = unumpy.uarray(np.c_[x_pow, np.ones_like(x)],
                      np.c_[x_uncertainty, np.zeros_like(x_uncertainty)])

y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0,
              55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])
y_fit = unumpy.uarray(y, y_error)

# Use normal equations to find coefficients
inv_mat = unumpy.ulinalg.pinv(x_fit.T.dot(x_fit))
fit_a, fit_b = inv_mat.dot(x_fit.T.dot(y_fit))
print('fit_a={}, fit_b={}'.format(fit_a, fit_b))
Result:
fit_a=4.8+/-2.6, fit_b=28+/-10
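If the goal is only to get past the curve_fit error, a simpler workaround is to hand curve_fit plain float arrays and supply the y uncertainty through its sigma argument. The sketch below assumes n = 1.0 and the same x ** n transform as the code above. It ignores the uncertainty on x entirely, so its parameter errors reflect y_error only and can be expected to be smaller than the ones printed above. With uarray inputs, unumpy.nominal_values and unumpy.std_devs would extract the float arrays.

```python
import numpy as np
from scipy import optimize


def linear_fit(x, a, b):
    return a * x + b


n = 1.0          # assumed fixed exponent, as in the code above
y_error = 1.2    # fixed uncertainty on every y value

x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65,
              2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0,
              55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])

# Plain float arrays only: curve_fit cannot work with uarray objects.
x_fit = x ** n
sigma = np.full_like(y, y_error)

# absolute_sigma=True treats sigma as real standard deviations
# rather than relative weights.
popt, pcov = optimize.curve_fit(linear_fit, x_fit, y,
                                sigma=sigma, absolute_sigma=True)

# One-sigma parameter errors from the diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))
print('fit_a={:.3g}+/-{:.2g}, fit_b={:.3g}+/-{:.2g}'.format(
    popt[0], perr[0], popt[1], perr[1]))
```

The best-fit values should agree with the nominal values of fit_a and fit_b above. The difference in the error bars gives a feel for how much of the uncertainty comes from x rather than y.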
The case n unknown
With n unknown, you really are in some trouble, since the problem is non-convex. Here, linear error analysis (as performed by uncertainties) will not work.
One solution is to perform Bayesian inference, using some package like pymc. If you are interested in this, I could try to make a writeup, but it would not be as clean as above.
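Short of a full Bayesian treatment, one rough way to explore the free-n case is to profile over n: for each n on a grid, solve the linear least-squares problem for a and b in closed form, then look at how chi-square varies with n. The sketch below rests on several assumptions. It uses the model y = a / x**n + b from the question's MWE, accounts only for the fixed y uncertainty (the 50% uncertainty on x is ignored), and takes the range of n whose chi-square lies within 1 of the minimum as an approximate interval. The names fit_at_fixed_n and n_interval are illustrative. The interval is only meaningful if the model fits the data well and the chi-square profile has a single minimum inside the grid.

```python
import numpy as np

y_error = 1.2  # fixed uncertainty on every y value

x = np.array([1.71, 1.76, 1.81, 1.86, 1.93, 2.01, 2.09, 2.20, 2.32, 2.47, 2.65,
              2.87, 3.16, 3.53, 4.02, 4.69, 5.64, 7.07, 9.35, 13.34, 21.43])
y = np.array([0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0,
              55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0, 90.0, 95.0, 100.0])


def fit_at_fixed_n(n):
    """Closed-form least squares for y = a / x**n + b at a fixed n."""
    design = np.c_[x ** (-n), np.ones_like(x)]
    coeffs, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    chi2 = np.sum(((y - design @ coeffs) / y_error) ** 2)
    return coeffs, chi2


# Skip n = 0, where both design columns are constant and a, b are degenerate.
n_grid = np.arange(0.005, 5.0, 0.005)
fits = [fit_at_fixed_n(n) for n in n_grid]
chi2 = np.array([c for _, c in fits])

best = int(np.argmin(chi2))
n_best = n_grid[best]
a_best, b_best = fits[best][0]

# Approximate interval: all grid values of n with chi2 within 1 of the minimum.
inside = n_grid[chi2 <= chi2[best] + 1.0]
n_interval = (inside.min(), inside.max())

print('n_best={:.3f}, a={:.3g}, b={:.3g}'.format(n_best, a_best, b_best))
print('approximate n interval: {:.3f} to {:.3f}'.format(*n_interval))
```

If the minimum sits at either end of the grid, the grid should be widened before reading anything into the interval.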