如何定量测量SciPy的拟合优度? [英] How to quantitatively measure goodness of fit in SciPy?

查看:155
本文介绍了如何定量测量SciPy的拟合优度?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想找出最适合给定数据的方法.我要做的是遍历n的各种值,并使用公式((y_fit-y_actual)/y_actual)x 100计算每个p的残差.然后,我计算每个n的平均值,然后找到最小残差均值以及相应的n值,并使用该值进行拟合.可复制的代码包括:

I am tying to find out the best fit for data given. What I did is I loop through various values of n and calculate the residual at each p using the formula ((y_fit - y_actual) / y_actual) x 100. Then I calculate the average of this for each n and then find the minimum residual mean and the corresponding n value and fit using this value. A reproducible code included:

import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize   

x = np.array([12.4, 18.2, 20.3, 22.9, 27.7, 35.5, 53.9])
y = np.array([1, 50, 60, 70, 80, 90, 100])
y_residual = np.empty(shape=(1, len(y)))
residual_mean = []

n = np.arange(0.01, 10, 0.01)

def fit(x, a, b):
    return a * x + b
for i in range (len(n)):
    x_fit = 1 / np.log(x) ** n[i]
    y_fit = y
    fit_a, fit_b = optimize.curve_fit(fit, x_fit, y_fit)[0]
    y_fit = (fit_a * x_fit) + fit_b
    y_residual = (abs(y_fit - y) / y) * 100
    residual_mean = np.append(residual_mean, np.mean(y_residual[np.isfinite(y_residual)]))
p = n[np.where(residual_mean == residual_mean.min())]
p = p[0]
print p
x_fit = 1 / np.log(x) ** p
y_fit = y
fit_a, fit_b = optimize.curve_fit(fit, x_fit, y_fit)[0]
y_fit = (fit_a * x_fit) + fit_b
y_residual = (abs(y_fit - y) / y) * 100

fig = plt.figure(1, figsize=(5, 5))
fig.clf()
plot = plt.subplot(111)
plot.plot(x, y, linestyle = '', marker='^')
plot.plot(x, y_fit, linestyle = ':')
plot.set_ylabel('y')
plot.set_xlabel('x')
plt.show()

fig_1 = plt.figure(2, figsize=(5, 5))
fig_1.clf()
plot_1 = plt.subplot(111)
plot_1.plot(1 / np.log(x) ** p, y, linestyle = '-')
plot_1.set_xlabel('pow(x, -p)' )
plot_1.set_ylabel('y' )
plt.show()

fig_2 = plt.figure(2, figsize=(5, 5))
fig_2.clf()
plot_2 = plt.subplot(111)
plot_2.plot(n, residual_mean, linestyle = '-')
plot_2.set_xlabel('n' )
plot_2.set_ylabel('Residual mean')
plt.show()

用n绘制残差均值,这就是我得到的:

Plotting residual mean with n, this is what I get:

我需要知道此方法是否正确才能确定最佳拟合.并且如果可以使用SciPy或其他任何软件包中的某些其他功能来完成此操作.从本质上讲,我想要定量地确定哪种方法最合适.我已经经历过 SciPy中的拟合优度测试,但这并没有帮助我很多.

I need to know if this method is correct to determine the best fit. And if it can be done with some other functions in SciPy or any other packages. In essence what I want is to quantitatively know which is the best fit. I already went through Goodness of fit tests in SciPy but it didn't help me much.

推荐答案

可能最常用的拟合优度度量是确定系数(又称为 R 2 值).

Probably the most commonly used goodness-of-fit measure is the coefficient of determination (aka the R2 value).

公式为:

其中:

在这里, y i 指的是您输入的y值, f i 指的是您输入的y值. -values和̅y表示平均输入y值.

Here, yi refers to your input y-values, fi refers to your fitted y-values, and ̅y refers to the mean input y-value.

这很容易计算:

# residual sum of squares
ss_res = np.sum((y - y_fit) ** 2)

# total sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)

# r-squared
r2 = 1 - (ss_res / ss_tot)

这篇关于如何定量测量SciPy的拟合优度?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆