有没有一种方法可以“压缩" lm()对象以进行以后的预测? [英] Is there a way to 'compress' an lm() object for later prediction?

查看:203
本文介绍了有没有一种方法可以“压缩" lm()对象以进行以后的预测?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否有一种压缩" lm类的对象的方法,以便可以将其保存到磁盘中,并稍后加载以供predict.lm使用?

Is there a way to 'compress' an object of class lm, so that I can save it to the disk and load it up later for use with predict.lm?

我有一个lm对象,保存后最终达到142mb,而且我很难相信预报.lm需要所有原始观测值/拟合值/残差等来进行线性预测.我可以删除信息以使保存的模型更小吗?

I have an lm object that ends up being ~142mb upon saving, and I have a hard time believing that predict.lm needs all of the original observations / fitted values / residuals etc. to make a linear prediction. Can I remove information so that the saved model is smaller?

我尝试将某些变量(fitted.values,residuals等)设置为NA,但这似乎对保存的文件大小没有影响.

I have tried setting some of the variables (fitted.values, residuals, etc.) to NA, but it seems to have no effect on the saved file size.

推荐答案

您可以使用biglm拟合模型,biglm模型对象小于lm模型对象.您可以使用predict.biglm创建一个函数,该函数可以将newdata设计矩阵传递至该函数,该函数将返回预测值.

You can use biglm to fit your models, a biglm model object is smaller than a lm model object. You can use predict.biglm create a function that you can pass the newdata design matrix to, which returns the predicted values.

另一种选择是使用saveRDS保存文件,这些文件看起来比较小,因为它们作为单个对象的开销较小,而不是像save那样可以保存多个对象.

Another option is to use saveRDS to save the files, which appear to be slightly smaller, as they have less overhead, being a single object, not like save which can save multiple objects.

 library(biglm)
 m <- lm(log(Volume)~log(Girth)+log(Height), trees)
 mm <- lm(log(Volume)~log(Girth)+log(Height), trees, model = FALSE, x =FALSE, y = FALSE)
 bm <- biglm(log(Volume)~log(Girth)+log(Height), trees)
 pred <- predict(bm, make.function = TRUE)
 save(m, file = 'm.rdata')
 save(mm, file = 'mm.rdata')
 save(bm, file = 'bm.rdata')
 save(pred, file = 'pred.rdata')
 saveRDS(m, file = 'm.rds')
 saveRDS(mm, file = 'mm.rds')
 saveRDS(bm, file = 'bm.rds')
 saveRDS(pred, file = 'pred.rds')

 file.info(paste(rep(c('m','mm','bm','pred'),each=2) ,c('.rdata','.rds'),sep=''))
#             size isdir mode mtime               ctime               atime               exe
#  m.rdata    2806 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:23 2013-03-07 11:29:30  no
#  m.rds      2798 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30  no
#  mm.rdata   2113 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:28 2013-03-07 11:29:30  no
#  mm.rds     2102 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30  no
#  bm.rdata    592 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:34 2013-03-07 11:29:30  no
#  bm.rds      583 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:29:30 2013-03-07 11:29:30  no
#  pred.rdata 1007 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:24:40 2013-03-07 11:29:30  no
#  pred.rds    995 FALSE  666 2013-03-07 11:29:30 2013-03-07 11:27:30 2013-03-07 11:29:30  no

这篇关于有没有一种方法可以“压缩" lm()对象以进行以后的预测?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆