Why isn't `curve_fit` able to estimate the covariance of the parameter if the parameter fits exactly?
Question
I don't understand why `curve_fit` isn't able to estimate the covariance of the parameter, thus raising the `OptimizeWarning` below. The following MCVE explains my problem:
MCVE Python code snippet
from scipy.optimize import curve_fit
func = lambda x, a: a * x
popt, pcov = curve_fit(f = func, xdata = [1], ydata = [1])
print(popt, pcov)
Output
python-3.4.4\lib\site-packages\scipy\optimize\minpack.py:715:
OptimizeWarning: Covariance of the parameters could not be estimated
category=OptimizeWarning)
[ 1.] [[ inf]]
For `a = 1` the function fits `xdata` and `ydata` exactly. Why isn't the error/variance `0`, or something close to `0`, but `inf` instead?
There is this quote from the `curve_fit` entry in the SciPy Reference Guide:
If the Jacobian matrix at the solution doesn’t have a full rank, then ‘lm’ method returns a matrix filled with np.inf, on the other hand ‘trf’ and ‘dogbox’ methods use Moore-Penrose pseudoinverse to compute the covariance matrix.
So, what's the underlying problem? Why doesn't the Jacobian matrix at the solution have a full rank?
Recommended answer
The formula for the covariance of the parameters (Wikipedia) has the number of degrees of freedom in the denominator. The degrees of freedom are computed as (number of data points) - (number of parameters), which is 1 - 1 = 0 in your example. SciPy checks the number of degrees of freedom before dividing by it, and when that number is not positive it fills the covariance matrix with `inf` and issues the warning. So the Jacobian's rank is not the issue here: the `inf` comes from this degrees-of-freedom check.
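The degrees-of-freedom argument can be sketched by hand for the one-parameter model `y = a * x`. This is only an illustration of the formula, not SciPy's actual code: the helper name `scaled_covariance` is hypothetical, and it assumes the default behaviour in which the covariance is scaled by the residual sum of squares divided by the degrees of freedom.

```python
import numpy as np

def scaled_covariance(xdata, ydata, a):
    """Covariance of the single parameter `a` in the model y = a * x.

    Hypothetical helper: scales inv(J^T J) by RSS / (N - p) and
    returns inf when there are no degrees of freedom left.
    """
    x = np.asarray(xdata, dtype=float)
    y = np.asarray(ydata, dtype=float)
    n_params = 1
    dof = x.size - n_params              # (data points) - (parameters)
    jac = x.reshape(-1, 1)               # d(a * x) / da = x
    unscaled = np.linalg.inv(jac.T @ jac)
    if dof <= 0:
        # Nothing to divide by: the variance cannot be estimated.
        return np.full_like(unscaled, np.inf)
    rss = np.sum((y - a * x) ** 2)       # residual sum of squares
    return unscaled * rss / dof

print(scaled_covariance([1], [1], a=1.0))        # one point, dof = 0
print(scaled_covariance([1, 2], [1, 2], a=1.0))  # two points, dof = 1
```

With one data point the helper returns `inf`, and with two points on the line it returns zero, even though the fit is exact in both cases.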
With `xdata = [1, 2], ydata = [1, 2]` you would get zero covariance (note that the model still fits exactly: exact fit is not the problem).
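A minimal variation of the MCVE from the question, with a second data point added, shows this. The exact printed digits may differ between SciPy versions, so no output is quoted here.

```python
import numpy as np
from scipy.optimize import curve_fit

# Same model as in the question: a straight line through the origin.
func = lambda x, a: a * x

# Two data points, one parameter: 2 - 1 = 1 degree of freedom.
popt, pcov = curve_fit(f=func, xdata=[1, 2], ydata=[1, 2])
print(popt, pcov)
```

The fit is still exact, but because one degree of freedom remains, the covariance is finite (numerically zero) and no `OptimizeWarning` is raised.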
This is the same sort of issue as the sample variance being undefined when the sample size N is 1 (the formula for sample variance has (N-1) in the denominator). If we take only a sample of size 1 from the population, we don't estimate the variance as zero; we know nothing about the variance.
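The analogy can be checked with Python's standard `statistics` module, which refuses to compute a sample variance from a single value. The wrapper name `sample_variance_or_none` below is hypothetical and only there to make the two cases easy to compare.

```python
import statistics

def sample_variance_or_none(sample):
    """Return the sample variance, or None when it is undefined."""
    try:
        return statistics.variance(sample)
    except statistics.StatisticsError:
        # Raised when fewer than two data points are given.
        return None

print(sample_variance_or_none([1.0]))        # undefined: N - 1 = 0
print(sample_variance_or_none([1.0, 1.0]))   # defined, and equal to zero
```

A single value gives no variance estimate at all, while two identical values give a well-defined variance of zero, mirroring the one-point and two-point fits above.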