Extract model summaries and store them as a new column

Problem description

I'm new to the purrr paradigm and am struggling with it.

Following a few sources, I have managed to get as far as nesting a data frame, running a linear model on the nested data, extracting some coefficients from each lm, and generating a summary for each lm. The last thing I want to do is extract the "r.squared" from the summary (which I would have thought would be the simplest part of what I'm trying to achieve), but for whatever reason I can't get the syntax right.

Here's an MWE of what I have that works:

library(purrr)
library(dplyr)
library(tidyr)

mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         sum = map(fit, ~summary))

And here's my attempt to extract the r.squared, which fails:

mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         sum = map(fit, ~summary),
         rsq = map_dbl(sum, "r.squared"))

Error in eval(substitute(expr), envir, enclos) : 
  `x` must be a vector (not a closure)

This is superficially similar to the example given on the RStudio site:

mtcars %>%
  split(.$cyl) %>%
  map(~ lm(mpg ~ wt, data = .x)) %>%
  map(summary) %>%
  map_dbl("r.squared")

This works; however, I would like the r.squared values to sit in a new column (hence the mutate statement), and I'd like to understand why my code isn't working rather than work around the problem.

Here's a working solution that I came to using the solutions below:

mtcars %>%
      nest(-cyl) %>% 
      mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
             summary = map(fit, glance),
             r_sq = map_dbl(summary, "r.squared"))

So, it actually turns out that the bug comes from the tilde in the sum = map(fit, ~summary) line. My guess is that the tilde makes each element the summary function itself, which then gets nested, rather than the object that summary returns. Would love an authoritative answer on this if someone wants to chime in.

To be clear, this version of the original code works fine:

mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, summary),
         r_sq = map_dbl(summary, "r.squared"))
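
The guess about the tilde can be checked directly. This is a minimal sketch, not an authoritative explanation: purrr turns a one-sided formula such as ~summary into an anonymous function whose body is the expression after the tilde. Here that body is just the name summary, so each call returns the function object instead of calling it. The toy list inputs is made up for illustration.

```r
library(purrr)

# Hypothetical toy input, used only to show what each call returns.
inputs <- list(1:3, 4:6)

# `~summary` becomes a function whose body is the symbol `summary`,
# so every element of the result is the summary function itself.
with_tilde <- map(inputs, ~summary)

# Passing the function directly, or calling it inside the formula,
# applies summary() to each element.
bare <- map(inputs, summary)
called <- map(inputs, ~ summary(.x))

is.function(with_tilde[[1]])  # each element is a function (a closure)
is.function(bare[[1]])        # each element is a summary object
```

Either map(fit, summary) or map(fit, ~ summary(.x)) should therefore give map_dbl something it can index by name.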

Recommended answer

To fit in your current pipe, you'd want to use unnest along with map and glance from the broom package.

library(purrr)  # provides map()
library(tidyr)
library(dplyr)
library(broom)  # provides glance()

mtcars %>%
  nest(-cyl) %>%
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .))) %>% 
  unnest(map(fit, glance))

You'll get more than just the r-squared, and from there you can use select to drop what you don't need.
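
As a sketch of that last step, assuming tidyr 1.0 or later (where nest() takes a named column specification and unnest() takes a column name rather than an expression), the pipe could look like the following. The column name glanced and the result name r2 are hypothetical, chosen only for this example.

```r
library(purrr)
library(tidyr)
library(dplyr)
library(broom)

r2 <- mtcars %>%
  nest(data = -cyl) %>%                            # one row per cyl, data in a list-column
  mutate(fit = map(data, ~ lm(mpg ~ wt, data = .x)),
         glanced = map(fit, glance)) %>%           # one-row summary tibble per model
  unnest(glanced) %>%                              # spread glance() columns into the frame
  select(cyl, r.squared, adj.r.squared)            # keep only what is needed

r2
```

The list-columns data and fit are dropped by select here; keep them in the select call if the fitted models are still needed downstream.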

If you want to keep the model summaries nested in list-columns:

mtcars %>%
  nest(-cyl) %>% 
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, glance)) 

If you just want to extract a single value from a nested frame, you only need to map to that value by name (and not use [[ or extract2 as I originally suggested; many thanks for finding that out).

mtcars %>%
  nest(-cyl) %>% 
  mutate(fit = map(data, ~lm(mpg ~ wt, data = .)),
         summary = map(fit, glance),
         r_sq = map_dbl(summary, "r.squared"))
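
For what it's worth, the string passed to map_dbl is purrr's shorthand for pulling out the element with that name from each list item. A minimal sketch of the equivalence, using base summary() instead of glance so that only purrr is needed (the names fits, summaries, by_name, and by_function are made up for this example):

```r
library(purrr)

# Fit one model per cylinder group, then summarise each fit.
fits <- map(split(mtcars, mtcars$cyl), ~ lm(mpg ~ wt, data = .x))
summaries <- map(fits, summary)

# A string is shorthand for extracting the element of that name...
by_name <- map_dbl(summaries, "r.squared")

# ...which should match an explicit extractor function.
by_function <- map_dbl(summaries, ~ .x$r.squared)

by_name
```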
