Any simple way to get regression prediction intervals in R?


Question

I am working with a large data set of over 300K elements and running a regression analysis to estimate a parameter called Rate from the predictor variable Distance. I have the regression equation, and now I want the confidence and prediction intervals. I can easily get the confidence intervals for the coefficients with the command:

> confint(W1500.LR1, level = 0.95)
              2.5 %      97.5 %
(Intercept) 666.2817393 668.0216072
Distance      0.3934499   0.3946572  

This gives me the upper and lower bounds of the confidence intervals for the coefficients. Now I want the corresponding upper and lower bounds for the prediction intervals. The only thing I have learnt so far is that I can get prediction intervals for specific values of Distance (say 200, 500, etc.) using the code:

predict(W1500.LR1, newdata, interval="predict")  

This is not useful for me because I have over 300K different Distance values, which would require running this code for each of them. Is there a simple way to get the prediction intervals, like the confint command shown above?

Recommended answer

I had to make up my own data, but here you go:

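# Simulated data standing in for the asker's data set
x = rnorm(300000)
y = jitter(3*x,1000)

fit = lm(y~x)

# Prediction intervals for every observed x in one call (columns: fit, lwr, upr).
# Without newdata, R warns that these refer to future responses at the observed x.
pred.int =  predict(fit,interval="prediction")

# Confidence intervals for the mean response (same column layout)
conf.int =  predict(fit,interval="confidence")

fitted.values = pred.int[,1]

pred.lower = pred.int[,2]
pred.upper = pred.int[,3]

# Plot the first 1000 points, ordered by x so the lines are drawn left to right
idx = order(x[1:1000])
plot(x[1:1000],y[1:1000])
lines(x[idx],fitted.values[idx],col="red",lwd=2)
lines(x[idx],pred.lower[idx],lwd=2,col="blue")
lines(x[idx],pred.upper[idx],lwd=2,col="blue")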

As you can see, prediction intervals are for predicting new data values, not for constructing intervals for the beta coefficients. The confidence intervals you actually want are obtained in the same fashion from conf.int.
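
For reference, the difference between the two interval types can be written out for simple linear regression with one predictor. This is a sketch of the standard textbook formulas, not something stated in the original answer; the symbols $x_0$ (the value at which to predict), $s$ (residual standard error), $S_{xx}$ and $t_{1-\alpha/2,\,n-2}$ are introduced here for illustration.

$$
\begin{aligned}
\hat{y}_0 &= \hat{\beta}_0 + \hat{\beta}_1 x_0, \qquad S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 \\
\text{confidence interval (mean response):}\quad & \hat{y}_0 \pm t_{1-\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{prediction interval (new observation):}\quad & \hat{y}_0 \pm t_{1-\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{aligned}
$$

The extra 1 under the square root accounts for the noise in a single new observation, which is why the prediction interval is always wider than the confidence interval at the same $x_0$.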

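If intervals are needed at chosen predictor values rather than at the observed ones, predict() does not have to be called once per value: newdata can hold all of them in a single data frame. The following is a minimal sketch on made-up data; the names x, y, fit, new_points, pred_new and conf_new are illustrative assumptions and not the asker's objects (for the asker's model the column would presumably be named Distance).

```r
# Illustrative data only; not the asker's data set
set.seed(1)
x <- rnorm(1000)
y <- 3 * x + rnorm(1000)
fit <- lm(y ~ x)

# Put every value of interest in one data frame.
# The column name must match the predictor name used in the model formula.
new_points <- data.frame(x = c(-1, 0, 0.5, 2))

# One vectorized call per interval type; each returns a matrix
# with columns fit, lwr, upr and one row per row of new_points.
pred_new <- predict(fit, newdata = new_points,
                    interval = "prediction", level = 0.95)
conf_new <- predict(fit, newdata = new_points,
                    interval = "confidence", level = 0.95)

# Side-by-side view of the inputs and their prediction intervals
cbind(new_points, pred_new)
```

The same pattern should scale to hundreds of thousands of rows, since the work is done in one matrix computation rather than in a loop.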