scipy.optimize.curve_fit() - array must not contain infs or NaNs

Question

I am trying to use scipy.optimize.curve_fit, and I get the error ValueError: array must not contain infs or NaNs.

I don't believe my x or y data contain infs or NaNs, since np.asarray_chkfinite raises nothing on either array:

>>> x_array = np.asarray_chkfinite(x_array)
>>> y_array = np.asarray_chkfinite(y_array)
>>>

To give some idea of what x_array and y_array look like at either end (x_array is counts and y_array is quantiles):

>>> type(x_array)
<type 'numpy.ndarray'>
>>> type(y_array)
<type 'numpy.ndarray'>
>>> x_array[:5]
array([0, 0, 0, 0, 0])
>>> x_array[-5:]
array([2919, 2965, 3154, 3218, 3461])
>>> y_array[:5]
array([ 0.9999582,  0.9999163,  0.9998745,  0.9998326,  0.9997908])
>>> y_array[-5:]
array([  1.67399000e-04,   1.25549300e-04,   8.36995200e-05,
     4.18497600e-05,  -2.22044600e-16])

My function:

>>> def func(x,alpha,beta,b):
...    return ((x/1)**(-alpha) * ((x+1*b)/(1+1*b))**(alpha-beta))
...

What I am trying to run:

>>> popt, pcov = curve_fit(func, x_array, y_array)

This results in the following stack trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 426, in curve_fit
    res = leastsq(func, p0, args=args, full_output=1, **kw)
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 338, in leastsq
    cov_x = inv(dot(transpose(R),R))
  File "/usr/lib/python2.7/dist-packages/scipy/linalg/basic.py", line 285, in inv
    a1 = asarray_chkfinite(a)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 590, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

I'm guessing the error is not about my arrays, but about an array that scipy creates in an intermediate step? I've dug through the relevant scipy source files, but debugging the problem that way gets hairy pretty quickly. Is there something obvious I'm doing wrong here? I've seen it mentioned in passing in other questions that certain initial parameter guesses (I currently don't supply any explicitly) can result in this kind of error. Even if that is the case, it would be good to know a) why that is and b) how to avoid it.

Recommended answer

Why it is failing

The problem is not that your input arrays contain NaNs or infs. Rather, evaluating your objective function at some x points, for some values of the parameters, produces NaNs or infs. In other words, at some stage of the optimization routine, the array of values func(x, alpha, beta, b) contains NaNs or infs for some x, alpha, beta and b.

The scipy.optimize curve fitting function uses the Levenberg-Marquardt algorithm, also called damped least-squares optimization. It is an iterative procedure: a new estimate of the optimal function parameters is computed at each iteration. So at some point during the optimization, the algorithm explores a region of the parameter space where your function is not defined.
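As a rough illustration of how the model itself can produce a non-finite value, the sketch below evaluates the question's func on a small made-up array. The sample x values are hypothetical; the all-ones parameters are what curve_fit starts from when no p0 is passed. Since x_array begins with zeros, the term x**(-alpha) is a likely culprit, though that is an inference from the data shown rather than something the traceback states.

```python
import numpy as np

def func(x, alpha, beta, b):
    # Same model as in the question.
    return (x / 1) ** (-alpha) * ((x + 1 * b) / (1 + 1 * b)) ** (alpha - beta)

# Hypothetical sample: counts starting at zero, like the head of x_array.
x = np.array([0.0, 1.0, 2.0, 10.0])

# Silence the NumPy warnings so only the resulting values are inspected.
with np.errstate(divide="ignore", invalid="ignore"):
    y = func(x, 1.0, 1.0, 1.0)  # default starting parameters: all ones

# Boolean mask of the entries that are neither inf nor NaN.
print(np.isfinite(y))
```

Running the same check on your own x_array with a few candidate parameter sets is a quick way to see where the model stops being finite.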

How to fix

1/ Initial guess

The initial guess for the parameters is decisive for convergence. If the initial guess is far from the optimal solution, the algorithm is more likely to explore regions where the objective function is undefined. So if you have a better idea of what the optimal parameters are and feed the algorithm that initial guess, the error may be avoided.
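A minimal sketch of supplying an initial guess through the p0 argument of curve_fit. The data here is synthetic and hypothetical: it is generated from the question's model with made-up "true" parameters on strictly positive x, and the starting values in p0 are likewise assumptions chosen to lie close to them.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, alpha, beta, b):
    # Same model as in the question.
    return (x / 1) ** (-alpha) * ((x + 1 * b) / (1 + 1 * b)) ** (alpha - beta)

# Hypothetical noise-free data generated from known parameters,
# on strictly positive x so the model is finite everywhere.
true_params = (0.5, 2.0, 10.0)
x_data = np.logspace(0, 3, 50)
y_data = func(x_data, *true_params)

# Explicit starting point near the expected solution (an assumed guess).
p0 = (0.4, 1.8, 8.0)
popt, pcov = curve_fit(func, x_data, y_data, p0=p0)
print(popt)
```

With real data, a reasonable p0 can often be read off a log-log plot of the data; how close it needs to be depends on the model and the data.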

2/ Model

Alternatively, you could modify your model so that it does not return NaNs. For parameter values params where the original function func is not defined, you want the objective function to take huge values. In other words, func(params) should be far from the Y values to be fitted.

Concretely, at points where your objective function is not defined, you can return a large float, for instance AVG(Y)*10e5, where AVG is the average, so that the returned value is much bigger than the average of the Y values to be fitted.
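One possible way to write such a guarded model is sketched below. The helper name make_safe_func and its scale parameter are hypothetical, introduced only for illustration; the penalty follows the AVG(Y)*10e5 suggestion, and the small arrays are made-up sample data.

```python
import numpy as np

def func(x, alpha, beta, b):
    # Same model as in the question.
    return (x / 1) ** (-alpha) * ((x + 1 * b) / (1 + 1 * b)) ** (alpha - beta)

def make_safe_func(y_data, scale=10e5):
    """Build a variant of func that never returns inf or NaN (hypothetical helper)."""
    # Large value, far from the Y values to be fitted: AVG(Y) * 10e5.
    penalty = abs(np.mean(y_data)) * scale

    def safe_func(x, alpha, beta, b):
        with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
            y = np.asarray(func(x, alpha, beta, b), dtype=float)
        # Keep finite values, replace inf/NaN entries by the penalty.
        return np.where(np.isfinite(y), y, penalty)

    return safe_func

# Made-up sample data, with a zero count as in the question's x_array.
x = np.array([0.0, 1.0, 2.0])
y_obs = np.array([1.0, 0.5, 0.25])

safe = make_safe_func(y_obs)
out = safe(x, 1.0, 1.0, 1.0)
print(np.isfinite(out))
```

Because safe_func keeps the explicit (x, alpha, beta, b) signature, it can be passed to curve_fit in place of func. Note that a point where the model is undefined for every parameter choice contributes the same large residual throughout, so this guards against the error without making such points meaningful to the fit.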

Link

You could have a look at this post: Fitting data to an equation in python vs gnuplot
