Feeding new data to an existing model and using broom::augment to add predictions

Question

I am using tidyverse, broom, and purrr to fit a model to some data, by group. I am then trying to use this model to predict on some new data, again by group. broom's 'augment' function nicely adds not only the predictions but also other values such as the standard error. However, I am unable to make 'augment' use the new data instead of the old data. As a result, my two sets of predictions are exactly the same. The question is: how can I make 'augment' use the new data instead of the old data (which was used to fit the model)?

Here's a reproducible example:

library(tidyverse)
library(broom)
library(purrr)

# nest the iris dataset by Species and fit a linear model
iris.nest <- nest(iris, data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)) %>% 
  mutate(model = map(data, function(df) lm(Sepal.Width ~ Sepal.Length, data=df)))

# create a new dataset where the Sepal.Length is 5x as big
newdata <- iris %>% 
  mutate(Sepal.Length = Sepal.Length*5) %>% 
  nest(data = c(Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)) %>% 
  rename("newdata"="data")

# join these two nested datasets together
iris.nest.new <- left_join(iris.nest, newdata, by = "Species")

# now form two new columns of predictions -- one using the "old" data that the model was
# initially fit on, and the second using the new data where the Sepal.Length has been increased
iris.nest.new <- iris.nest.new %>% 
  mutate(preds = map(model, broom::augment),
         preds.new = map2(model, newdata, broom::augment))  # THIS LINE DOESN'T WORK ****
                             
# unnest the predictions on the "old" data
preds <- select(iris.nest.new, preds) %>% 
 unnest(cols = c(preds))
# rename the columns prior to merging
names(preds)[3:9] <- paste0("old", names(preds)[3:9])

# now unnest the predictions on the "new" data
preds.new <- select(iris.nest.new, preds.new) %>% 
 unnest(cols = c(preds.new))
#... and also rename columns prior to merging
names(preds.new)[3:9] <- paste0("new", names(preds.new)[3:9])

# merge the two sets of predictions and compare
compare <- bind_cols(preds, preds.new) 

# compare
select(compare, old.fitted, new.fitted) %>% View(.) # EXACTLY THE SAME!!!!

Recommended answer

When calling broom::augment, note that newdata= is the third parameter; the second is data=. purrr::map2 passes the values it iterates over as the first two positional arguments, so your new data is matched to data= rather than newdata=. The names of the list columns you pass in make no difference. You need to pass the new data explicitly through the newdata= argument:

iris.nest.new <- iris.nest.new %>% 
  mutate(preds = map(model, broom::augment),
         preds.new = map2(model, newdata, ~broom::augment(.x, newdata=.y)))

You can see the difference by running these two commands:

broom::augment(iris.nest.new$model[[1]], iris.nest.new$newdata[[1]])
broom::augment(iris.nest.new$model[[1]], newdata=iris.nest.new$newdata[[1]])
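The same point can be checked on a single model without the nested table. This is a minimal sketch, assuming a version of broom in which augment() for lm objects takes its arguments in the order x, data, newdata; the names setosa, scaled, by_position, and by_name are made up for the illustration and do not appear in the question.

```r
library(broom)

# Fit on one species only, mirroring one row of the nested table
setosa <- subset(iris, Species == "setosa")
fit <- lm(Sepal.Width ~ Sepal.Length, data = setosa)

# Illustrative new data: Sepal.Length scaled by 5, as in the question
scaled <- transform(setosa, Sepal.Length = Sepal.Length * 5)

# Positional call: `scaled` is matched to `data`, so no new predictions are made
by_position <- augment(fit, scaled)
# Named call: `scaled` is matched to `newdata`, so predictions use the new values
by_name <- augment(fit, newdata = scaled)

# The positional call reproduces the original fitted values
same_as_fit <- all.equal(unname(by_position$.fitted), unname(fitted(fit)))
# The named call agrees with predict() on the new data
same_as_predict <- all.equal(unname(by_name$.fitted),
                             unname(predict(fit, newdata = scaled)))
```

If both checks come back TRUE, the positional call never produced new predictions, which is why the two columns in the question were identical.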
