Fitting a curve to a power-law distribution with curve_fit does not work

Question

I am trying to find a curve that fits my data, which visually appear to follow a power-law distribution.

I hoped to use scipy.optimize.curve_fit, but no matter which function or data normalization I try, I get either a RuntimeError (parameters not found, or overflow) or a curve that does not fit my data even remotely. Please help me figure out what I am doing wrong here.

%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

df = pd.DataFrame({
            'x': [ 1000, 3250, 5500, 10000, 32500, 55000, 77500, 100000, 200000 ],
            'y': [ 1100, 500, 288, 200, 113, 67, 52, 44, 5 ]
        })
df.plot(x='x', y='y', kind='line', style='--ro', figsize=(10, 5))

def func_powerlaw(x, m, c, c0):
    return c0 + x**m * c

target_func = func_powerlaw

X = df['x']
y = df['y']

popt, pcov = curve_fit(target_func, X, y)

plt.figure(figsize=(10, 5))
plt.plot(X, target_func(X, *popt), '--', label='fit')  # labels give plt.legend() something to show
plt.plot(X, y, 'ro', label='data')
plt.legend()
plt.show()

Output

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-243-17421b6b0c14> in <module>()
     18 y = df['y']
     19 
---> 20 popt, pcov = curve_fit(target_func, X, y)
     21 
     22 plt.figure(figsize=(10, 5))

/Users/evgenyp/.virtualenvs/kindle-dev/lib/python2.7/site-packages/scipy/optimize/minpack.pyc in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, **kwargs)
    653         cost = np.sum(infodict['fvec'] ** 2)
    654         if ier not in [1, 2, 3, 4]:
--> 655             raise RuntimeError("Optimal parameters not found: " + errmsg)
    656     else:
    657         res = least_squares(func, p0, args=args, bounds=bounds, method=method,

RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.

Recommended answer

As the traceback states, the maximum number of function evaluations was reached without finding a stationary point (at which the algorithm would terminate). You can raise this limit with the maxfev option. For this example, maxfev=2000 is large enough for the algorithm to terminate successfully.

However, the resulting solution is not satisfactory. The algorithm starts from a default initial estimate of the parameters, which is a poor one for this example (the large number of iterations required is an indicator of this). Supplying a different starting point through p0 (found by simple trial and error) gives a good fit, without any need to increase maxfev.
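To make the point about the starting estimate concrete, here is a minimal sketch that evaluates the model at two starting points before any fitting is done. When p0 is omitted, curve_fit starts every parameter at 1, which for this model is the increasing line 1 + x, while the data decrease. The helper name sum_sq_residuals is introduced here for illustration only and is not part of the original answer.

```python
import numpy as np

x = np.array([1000, 3250, 5500, 10000, 32500, 55000, 77500, 100000, 200000], dtype=float)
y = np.array([1100, 500, 288, 200, 113, 67, 52, 44, 5], dtype=float)

def func_powerlaw(x, m, c, c0):
    return c0 + x**m * c

def sum_sq_residuals(params):
    """Sum of squared residuals of the model at the given (m, c, c0)."""
    return float(np.sum((func_powerlaw(x, *params) - y) ** 2))

default_start = (1.0, 1.0, 1.0)  # what curve_fit uses when p0 is omitted
manual_start = (-1.0, 1e5, 0.0)  # the trial-and-error guess from the answer

print("default start:", sum_sq_residuals(default_start))
print("manual start: ", sum_sq_residuals(manual_start))
```

The manual guess already sits several orders of magnitude closer to the data in squared error, so the optimizer has far less ground to cover from there.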

The two fits and a visual comparison with the data are shown below.

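x = np.asarray([1000, 3250, 5500, 10000, 32500, 55000, 77500, 100000, 200000])
y = np.asarray([1100, 500, 288, 200, 113, 67, 52, 44, 5])

# Fit 1: default initial guess, with a higher limit on function evaluations
sol1 = curve_fit(func_powerlaw, x, y, maxfev=2000)
# Fit 2: hand-picked initial guess (m, c, c0), default evaluation limit
sol2 = curve_fit(func_powerlaw, x, y, p0=np.asarray([-1, 10**5, 0]))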
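As a possible alternative to trial and error, the starting point can be estimated from the data. The sketch below is not part of the original answer. It assumes the offset c0 can be ignored for the purpose of the guess, so that log(y) = log(c) + m * log(x) is a straight line in log-log space, and it fits that line with np.polyfit. The helper sum_sq_residuals is again a hypothetical name used only for this illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([1000, 3250, 5500, 10000, 32500, 55000, 77500, 100000, 200000], dtype=float)
y = np.array([1100, 500, 288, 200, 113, 67, 52, 44, 5], dtype=float)

def func_powerlaw(x, m, c, c0):
    return c0 + x**m * c

def sum_sq_residuals(params):
    """Sum of squared residuals of the model at the given (m, c, c0)."""
    return float(np.sum((func_powerlaw(x, *params) - y) ** 2))

# Assumption: drop c0 for the guess, then fit a straight line to (log x, log y).
# The slope estimates the exponent m and exp(intercept) estimates the scale c.
slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
p0 = [slope, np.exp(intercept), 0.0]

# Refine the rough guess with the full three-parameter model.
popt, pcov = curve_fit(func_powerlaw, x, y, p0=p0)

print("initial guess (m, c, c0):", p0)
print("fitted (m, c, c0):       ", popt)
print("squared error before/after:", sum_sq_residuals(p0), sum_sq_residuals(popt))
```

The log-log line only works when every y value is positive, and it weights small values such as the last data point more heavily than a direct least-squares fit would, so it is best treated as a rough starting point rather than a final answer.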
