How to do exponential and logarithmic curve fitting in Python? I found only polynomial fitting


Problem description


I have a set of data and I want to compare which line describes it best (polynomials of different orders, exponential or logarithmic).

I use Python and Numpy and for polynomial fitting there is a function polyfit(). But I found no such functions for exponential and logarithmic fitting.

Are there any? Or how to solve it otherwise?

Solution

For fitting y = A + B log x, just fit y against (log x).

>>> import numpy
>>> x = numpy.array([1, 7, 20, 50, 79])
>>> y = numpy.array([10, 19, 30, 35, 51])
>>> numpy.polyfit(numpy.log(x), y, 1)
array([ 8.46295607,  6.61867463])
# y ≈ 8.46 log(x) + 6.62


For fitting y = A e^(Bx), take the logarithm of both sides, which gives log y = log A + Bx. So fit (log y) against x.

Note that fitting (log y) as if it were linear emphasizes small values of y, causing large deviations for large y. This is because polyfit (linear regression) works by minimizing ∑_i (ΔY_i)^2 = ∑_i (Y_i − Ŷ_i)^2. When Y_i = log y_i, the residuals are ΔY_i = Δ(log y_i) ≈ Δy_i / |y_i|. So even if polyfit makes a very bad decision for large y, the "divide-by-|y|" factor compensates for it, and polyfit ends up favoring small values.

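This can be alleviated by giving each entry a "weight" proportional to y. polyfit supports weighted least squares via the w keyword argument. Since polyfit applies w to the unsquared residuals, passing w=numpy.sqrt(y) weights each squared residual by y.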

>>> x = numpy.array([10, 19, 30, 35, 51])
>>> y = numpy.array([1, 7, 20, 50, 79])
>>> numpy.polyfit(x, numpy.log(y), 1)
array([ 0.10502711, -0.40116352])
#    y ≈ exp(-0.401) * exp(0.105 * x) = 0.670 * exp(0.105 * x)
# (^ biased towards small values)
>>> numpy.polyfit(x, numpy.log(y), 1, w=numpy.sqrt(y))
array([ 0.06009446,  1.41648096])
#    y ≈ exp(1.42) * exp(0.0601 * x) = 4.12 * exp(0.0601 * x)
# (^ not so biased)
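To see the effect of the weights in the original y units rather than in log y, one can compare the sum of squared residuals of both fits. This is only a sketch: the helper names fit_exponential and sse are made up for illustration, and the data is the same as in the session above.

```python
import numpy as np

x = np.array([10, 19, 30, 35, 51], dtype=float)
y = np.array([1, 7, 20, 50, 79], dtype=float)

def fit_exponential(x, y, w=None):
    """Fit y = A * exp(B * x) via a straight-line fit to log(y); return (A, B)."""
    # polyfit returns the slope first, then the intercept.
    B, log_A = np.polyfit(x, np.log(y), 1, w=w)
    return np.exp(log_A), B

def sse(x, y, A, B):
    """Sum of squared residuals in the original y units (not in log y)."""
    return float(np.sum((y - A * np.exp(B * x)) ** 2))

A_plain, B_plain = fit_exponential(x, y)                      # unweighted
A_weighted, B_weighted = fit_exponential(x, y, w=np.sqrt(y))  # weighted by sqrt(y)

sse_plain = sse(x, y, A_plain, B_plain)
sse_weighted = sse(x, y, A_weighted, B_weighted)

print(f"unweighted: A={A_plain:.3f}, B={B_plain:.4f}, SSE={sse_plain:.1f}")
print(f"weighted:   A={A_weighted:.3f}, B={B_weighted:.4f}, SSE={sse_weighted:.1f}")
```

For this data the weighted fit should give the smaller SSE, because the unweighted fit misses the largest y values by a wide margin.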

Note that Excel, LibreOffice and most scientific calculators typically use the unweighted (biased) formula for exponential regression / trend lines. If you want your results to be compatible with these platforms, do not include the weights even if they provide better results.


Now, if you can use scipy, you could use scipy.optimize.curve_fit to fit any model without transformations.

For y = A + B log x the result is the same as the transformation method:

>>> import scipy.optimize
>>> x = numpy.array([1, 7, 20, 50, 79])
>>> y = numpy.array([10, 19, 30, 35, 51])
>>> scipy.optimize.curve_fit(lambda t,a,b: a+b*numpy.log(t),  x,  y)
(array([ 6.61867467,  8.46295606]), 
 array([[ 28.15948002,  -7.89609542],
        [ -7.89609542,   2.9857172 ]]))
# y ≈ 6.62 + 8.46 log(x)
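The second array returned by curve_fit is the estimated covariance matrix of the parameters. A common way to read it, sketched below, is to take the square root of its diagonal as one-standard-deviation uncertainties. The name log_model is just an illustrative stand-in for the lambda used above.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([1, 7, 20, 50, 79], dtype=float)
y = np.array([10, 19, 30, 35, 51], dtype=float)

def log_model(t, a, b):
    """Logarithmic model y = a + b * log(t)."""
    return a + b * np.log(t)

popt, pcov = curve_fit(log_model, x, y)

# Standard errors of the fitted parameters: square roots of the variances
# on the diagonal of the covariance matrix.
perr = np.sqrt(np.diag(pcov))

for name, value, err in zip(("A", "B"), popt, perr):
    print(f"{name} = {value:.2f} +/- {err:.2f}")
```

With only five data points these uncertainties are rough, so treat them as an indication rather than a precise error bar.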

For y = A e^(Bx), however, we can get a better fit, since curve_fit minimizes the residuals Δy directly instead of Δ(log y). But we need to provide an initial guess so that curve_fit can reach the desired local minimum.

>>> x = numpy.array([10, 19, 30, 35, 51])
>>> y = numpy.array([1, 7, 20, 50, 79])
>>> scipy.optimize.curve_fit(lambda t,a,b: a*numpy.exp(b*t),  x,  y)
(array([  5.60728326e-21,   9.99993501e-01]),
 array([[  4.14809412e-27,  -1.45078961e-08],
        [ -1.45078961e-08,   5.07411462e+10]]))
# oops, definitely wrong.
>>> scipy.optimize.curve_fit(lambda t,a,b: a*numpy.exp(b*t),  x,  y,  p0=(4, 0.1))
(array([ 4.88003249,  0.05531256]),
 array([[  1.01261314e+01,  -4.31940132e-02],
        [ -4.31940132e-02,   1.91188656e-04]]))
# y ≈ 4.88 exp(0.0553 x). much better.
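Putting the pieces together for the original question (which curve describes the data best), here is a rough sketch that fits several models to the same data and ranks them by the sum of squared residuals. Two things here are my own assumptions rather than part of the answer above: using SSE as the yardstick, and seeding curve_fit with the weighted log-linear fit instead of a hand-picked p0.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([10, 19, 30, 35, 51], dtype=float)
y = np.array([1, 7, 20, 50, 79], dtype=float)

def sse(y_pred):
    """Sum of squared residuals against the observed y."""
    return float(np.sum((y - y_pred) ** 2))

results = {}

# Polynomials of degree 1 and 2 via polyfit.
for degree in (1, 2):
    coeffs = np.polyfit(x, y, degree)
    results[f"polynomial (degree {degree})"] = sse(np.polyval(coeffs, x))

# Logarithmic model y = a + b * log(x): a straight line in log(x).
b_log, a_log = np.polyfit(np.log(x), y, 1)
results["logarithmic"] = sse(a_log + b_log * np.log(x))

# Exponential model y = a * exp(b * x). The weighted log-linear fit
# supplies the initial guess, then curve_fit refines it in y units.
b0, log_a0 = np.polyfit(x, np.log(y), 1, w=np.sqrt(y))
(a_exp, b_exp), _ = curve_fit(
    lambda t, a, b: a * np.exp(b * t), x, y, p0=(np.exp(log_a0), b0)
)
results["exponential"] = sse(a_exp * np.exp(b_exp * x))

# Smallest SSE first.
for name, value in sorted(results.items(), key=lambda item: item[1]):
    print(f"{name:22s} SSE = {value:.1f}")
```

Keep in mind that a higher-degree polynomial never has a larger SSE than a lower-degree one on the same data, so a raw SSE ranking favors models with more parameters. With this few points, also look at the plot before picking a winner.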
