R - Calculate Test MSE given a trained model from a training set and a test set


Problem description

Given two simple sets of data:

 head(training_set)
      x         y
    1 1  2.167512
    2 2  4.684017
    3 3  3.702477
    4 4  9.417312
    5 5  9.424831
    6 6 13.090983

 head(test_set)
      x        y
    1 1 2.068663
    2 2 4.162103
    3 3 5.080583
    4 4 8.366680
    5 5 8.344651

I want to fit a linear regression line on the training data, and use that line (or the coefficients) to calculate the "test MSE" or Mean Squared Error of the Residuals on the test data once that line is fit there.

model = lm(y~x,data=training_set)
train_MSE = mean(model$residuals^2)
test_MSE = ?

Recommended answer

In this case, it is more precise to call it MSPE (mean squared prediction error):

mean((test_set$y - predict.lm(model, test_set)) ^ 2)

This is a more useful measure as all models aim at prediction. We want a model with minimal MSPE.

In practice, if we do have a spare test data set, we can directly compute MSPE as above. However, very often we don't have spare data. In statistics, the leave-one-out cross-validation is an estimate of MSPE from the training dataset.
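As an illustration (not part of the original answer), the leave-one-out CV error of an lm fit can be computed exactly from the hat values, without refitting the model n times; a minimal sketch assuming the `model` object fitted above:

# Exact leave-one-out cross-validation MSE for an lm fit, using the
# closed-form shortcut e_i / (1 - h_ii) so no refitting loop is needed
loocv_mse <- mean((residuals(model) / (1 - hatvalues(model)))^2)
loocv_mse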

There are also several other statistics for assessing prediction error, like Mallows's statistic and AIC.
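For reference, AIC is available directly in base R; a quick illustration with the `model` object from above (Mallows's Cp is normally computed when comparing against a larger candidate model, e.g. via the leaps package, and is not shown here):

# Akaike Information Criterion of the fitted linear model; when comparing
# candidate models on the same data, the one with the lower AIC is preferred
AIC(model)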
