R - Calculate Test MSE given a trained model from a training set and a test set


Question

Given two simple sets of data:

 head(training_set)
      x         y
    1 1  2.167512
    2 2  4.684017
    3 3  3.702477
    4 4  9.417312
    5 5  9.424831
    6 6 13.090983

 head(test_set)
      x        y
    1 1 2.068663
    2 2 4.162103
    3 3 5.080583
    4 4 8.366680
    5 5 8.344651

I want to fit a linear regression line on the training data, and use that line (or the coefficients) to calculate the "test MSE", i.e. the mean squared error of the residuals, on the test data once that line is fit there.

model = lm(y~x,data=training_set)
train_MSE = mean(model$residuals^2)
test_MSE = ?

Answer

In this case, it is more precise to call it the MSPE (mean squared prediction error):

mean((test_set$y - predict.lm(model, test_set)) ^ 2)

This is a more useful measure, since all models ultimately aim at prediction; we want the model with the smallest MSPE.
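
For completeness, here is a minimal end-to-end sketch. The data-generating step below is hypothetical (chosen only to resemble the head() output above); the fit and the two error measures follow the same pattern as the answer:

# Hypothetical data, roughly matching the head() output above
set.seed(1)
training_set <- data.frame(x = 1:20)
training_set$y <- 2 * training_set$x + rnorm(20, sd = 1.5)
test_set <- data.frame(x = 1:10)
test_set$y <- 2 * test_set$x + rnorm(10, sd = 1.5)

# Fit on the training data only
model <- lm(y ~ x, data = training_set)

# Training MSE: average squared residual on the data the model saw
train_MSE <- mean(residuals(model)^2)

# Test MSPE: predict on the held-out set, then average the squared errors
test_MSPE <- mean((test_set$y - predict(model, newdata = test_set))^2)

train_MSE
test_MSPE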

In practice, if we do have a spare test data set, we can compute the MSPE directly as above. Very often, however, there is no spare data; in statistics, leave-one-out cross-validation gives an estimate of the MSPE from the training dataset alone.
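
For a model fitted with lm(), the leave-one-out estimate does not require refitting the model n times: a closed-form shortcut divides each ordinary residual by one minus its leverage. A sketch, reusing the model object fitted above:

# LOOCV estimate of MSPE for an lm() fit: each leave-one-out residual
# equals the ordinary residual divided by (1 - leverage)
loocv_MSPE <- mean((residuals(model) / (1 - hatvalues(model)))^2)
loocv_MSPE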

There are also several other statistics for assessing prediction error, such as Mallows's Cp statistic and AIC.
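
For instance, base R reports AIC for a fitted lm object directly; a lower value is preferred when comparing candidate models fitted to the same data:

# Akaike information criterion of the training fit
AIC(model)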
