Using lm in a list column to predict new values using purrr

Problem description

I am trying to add a column of predictions to a data frame that has a list column containing an lm model. I adapted some of the code from this post.

I have made a toy example here:

library(dplyr)
library(purrr)
library(tidyr)
library(broom)

set.seed(1234)

exampleTable <- data.frame(
  ind = c(rep(1:5, 5)),
  dep = rnorm(25),
  groups = rep(LETTERS[1:5], each = 5)
) %>%
  group_by(groups) %>%
  nest(.key=the_data) %>%
  mutate(model = the_data %>% map(~lm(dep ~ ind, data = .))) %>%
  mutate(Pred = map2(model, the_data, predict))

exampleTable <- exampleTable %>%
  mutate(ind=row_number())

This gives me something like this:

# A tibble: 5 × 5
  groups         the_data    model      Pred   ind 
  <fctr>           <list>   <list>    <list> <int> 
1      A <tibble [5 × 2]> <S3: lm> <dbl [5]>     1 
2      B <tibble [5 × 2]> <S3: lm> <dbl [5]>     2 
3      C <tibble [5 × 2]> <S3: lm> <dbl [5]>     3 
4      D <tibble [5 × 2]> <S3: lm> <dbl [5]>     4 
5      E <tibble [5 × 2]> <S3: lm> <dbl [5]>     5 

To get a predicted value from the lm model for a specific group, I can use this:

predict(exampleTable[1,]$model[[1]], slice(exampleTable, 1) %>% select(ind))

which yields the following result:

> predict(exampleTable[1,]$model[[1]], slice(exampleTable, 1) %>% select(ind))
         1 
-0.4822045

I would like to have one new prediction for each group. I tried using purrr to get what I wanted:

exampleTable %>%
  mutate(Prediction = map2(model, ind, predict))

But this produces the following error:

Error in mutate_impl(.data, dots) : object 'ind' not found

I was able to get the result I wanted with the following monstrosity:

exampleTable$Prediction <- NA

for(loop in seq_along(exampleTable$groups)){
  lmod <- exampleTable[loop, ]$model[[1]]
  obs <- filter(exampleTable, row_number()==loop) %>%
    select(ind)
  exampleTable[loop, ]$Prediction <- as.numeric(predict(lmod, obs))
}

That gives me a tibble that looks like this:

# A tibble: 5 × 6
  groups         the_data    model      Pred   ind Prediction
  <fctr>           <list>   <list>    <list> <int>      <dbl>
1      A <tibble [5 × 2]> <S3: lm> <dbl [5]>     1 -0.4822045
2      B <tibble [5 × 2]> <S3: lm> <dbl [5]>     2 -0.1357712
3      C <tibble [5 × 2]> <S3: lm> <dbl [5]>     3 -0.2455760
4      D <tibble [5 × 2]> <S3: lm> <dbl [5]>     4  0.4818425
5      E <tibble [5 × 2]> <S3: lm> <dbl [5]>     5 -0.3473236

There must be a way to do this in a 'tidy' way, but I just can't crack it.

Recommended answer

You can take advantage of the newdata argument to predict.

I use map2_dbl so that it returns just a single value per group rather than a list.

mutate(Pred = map2_dbl(model, 1:5, ~predict(.x, newdata = data.frame(ind = .y))))

# A tibble: 5 x 4
  groups         the_data    model       Pred
  <fctr>           <list>   <list>      <dbl>
1      A <tibble [5 x 2]> <S3: lm> -0.4822045
2      B <tibble [5 x 2]> <S3: lm> -0.1357712
3      C <tibble [5 x 2]> <S3: lm> -0.2455760
4      D <tibble [5 x 2]> <S3: lm>  0.4818425
5      E <tibble [5 x 2]> <S3: lm> -0.3473236
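
For anyone wondering why newdata matters here, the following is a minimal base R sketch (not part of the original answer). It assumes a small hypothetical data frame called toy with the same column names as the question. Without newdata, predict returns one fitted value per training row, which is what the list column Pred in the question holds. With newdata, it returns one value per row of the supplied data frame, and that data frame must contain a column named after the predictor (ind).

```r
set.seed(1234)

# Hypothetical single-group data with the same column names as the question
toy <- data.frame(ind = 1:5, dep = rnorm(5))
fit <- lm(dep ~ ind, data = toy)

# Without newdata: one fitted value for each row used to fit the model
fitted_all <- predict(fit)
length(fitted_all)

# With newdata: one value per row of the new data frame;
# the column name must match the predictor in the formula
one_pred <- predict(fit, newdata = data.frame(ind = 1))
length(one_pred)

# Sanity check: the prediction is intercept + slope * 1
manual <- unname(coef(fit)[1] + coef(fit)[2] * 1)
all.equal(unname(one_pred), manual)
```

This is why the anonymous function wraps .y in data.frame(ind = .y) instead of passing the bare number to predict.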

If you add ind to the dataset before prediction, you can use that column instead of 1:5.

mutate(ind = 1:5) %>%
    mutate(Pred = map2_dbl(model, ind, ~predict(.x, newdata = data.frame(ind = .y) )))

# A tibble: 5 x 5
  groups         the_data    model   ind       Pred
  <fctr>           <list>   <list> <int>      <dbl>
1      A <tibble [5 x 2]> <S3: lm>     1 -0.4822045
2      B <tibble [5 x 2]> <S3: lm>     2 -0.1357712
3      C <tibble [5 x 2]> <S3: lm>     3 -0.2455760
4      D <tibble [5 x 2]> <S3: lm>     4  0.4818425
5      E <tibble [5 x 2]> <S3: lm>     5 -0.3473236
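
Putting the pieces together, here is a self-contained sketch of the whole pipeline. It is one possible arrangement rather than the exact code from the answer, and it assumes tidyr 1.0.0 or later, where nest accepts the new_column = c(cols) form. The names raw and result are introduced only for this example. Nesting without group_by returns an ungrouped tibble, so row_number() counts 1 to 5 across the groups as in the question.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(1234)

# Same toy data as in the question (the name `raw` is illustrative)
raw <- data.frame(
  ind = rep(1:5, 5),
  dep = rnorm(25),
  groups = rep(LETTERS[1:5], each = 5)
)

result <- raw %>%
  # Nest ind and dep into a list column, leaving one row per group
  nest(the_data = c(ind, dep)) %>%
  mutate(
    # Fit one model per group
    model = map(the_data, ~ lm(dep ~ ind, data = .x)),
    # The value of ind to predict at for each group (1 for A, 2 for B, ...)
    ind = row_number(),
    # One numeric prediction per row, using that row's model and ind value
    Pred = map2_dbl(model, ind, ~ predict(.x, newdata = data.frame(ind = .y)))
  )

result
```

Because map2_dbl returns a plain double vector, Pred is an ordinary numeric column and needs no unnesting afterwards.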
